Delete multiple columns using awk or sed
I have a database with 6037 space-separated columns and 450 rows like the one below:
1807 1452 1598 1 6.655713 A B A B ... 0
1808 1452 1763 1 9.362033 0 0 A B ... A
1809 1452 1527 2 6.728534 A B A A ... B
1810 1452 1367 2 9.4055 A B A A B ... A
... ... ... ... ... ... ... ... ... ...
1812 1452 1258 1 6.363032 0 0 A B ... B
I want to get a new database with only the first 676 columns, preferably using an awk or sed command.
text-processing sed awk
asked Mar 21 at 21:51 by andrec; edited Mar 21 at 23:36 by dessert
3 Answers
If the column delimiter in your file is a single character, e.g. a space, cut can do that easily:
cut -d' ' -f-676 <in >out
This prints only the space-separated columns from the first to the 676th.
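As a quick check of the cut approach, the sketch below runs the same command on a made-up sample and keeps the first 3 columns instead of 676. The sample rows and the helper name first_cols are assumptions for illustration only, not part of the question's file.

```shell
#!/bin/sh
# first_cols N: keep the first N space-separated columns of stdin.
# Hypothetical helper for illustration; it only wraps the cut call shown above.
first_cols() {
    cut -d' ' -f-"$1"
}

# Made-up sample with exactly one space between columns.
sample='1807 1452 1598 1 6.655713 A B
1808 1452 1763 1 9.362033 0 0
1809 1452 1527 2 6.728534 A B'

printf '%s\n' "$sample" | first_cols 3
```

Keep in mind that cut treats every single space as a delimiter, so two consecutive spaces yield an empty column between them.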
If you need every whitespace character to count as a delimiter, a sed solution (GNU sed, since \s, \S and a number combined with the g flag are GNU extensions) is:
sed -r 's/\s+\S+//676g' <in >out
This deletes every match of "at least one whitespace character followed by at least one non-whitespace character" from the 676th match onward. Because the first column is not preceded by whitespace, the 676th match is column 677, so exactly the first 676 columns remain. Using bracket expressions you can specify any set of delimiters you need, e.g. for “4”, “#” and “K”:
sed -r 's/[4#K]+[^4#K]+//676g' <in >out
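To see how the numeric flag lines up with column numbers, here is a small sketch on made-up input. It assumes GNU sed and lines that do not begin with whitespace; under that assumption the k-th match of the pattern is column k+1, so passing N as the flag leaves exactly N columns. The helper name keep_cols is hypothetical.

```shell
#!/bin/sh
# keep_cols N: keep the first N whitespace-separated columns of stdin.
# Hypothetical helper; assumes GNU sed (for \s, \S and the combined "Ng" flag)
# and input lines that do not start with whitespace.
keep_cols() {
    # Match k is column k+1, so deleting from match N onward keeps columns 1..N.
    sed -r "s/\s+\S+//${1}g"
}

# Made-up sample with irregular runs of spaces and a tab between columns.
printf '1807  1452\t1598 1   6.655713 A B\n' | keep_cols 3
```

The whitespace between the kept columns is passed through unchanged, which may or may not be what you want.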
For a reasonable awk approach kindly refer to steeldriver’s answer, but here is another one that loops over the columns and prints only the first 676 of them, separated by FS:
awk '{for (i=1;i<=676;i++) printf "%s%s", (i==1?"":FS), $i; print ""}' <in >out
If the field separator is a bracket expression, you have to choose the separator for the output yourself, e.g. for [4#K] and "sep":
awk -F'[4#K]' '{for (i=1;i<=676;i++) printf "%s%s", (i==1?"":"sep"), $i; print ""}' <in >out
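The loop can be tried on a small made-up sample as sketched below. The helper name awk_first_cols and the extra i <= NF guard (so that short lines do not gain empty trailing fields) are my additions, not part of the commands above.

```shell
#!/bin/sh
# awk_first_cols N: print the first N whitespace-separated fields of each line,
# joined by a single space. Hypothetical helper for illustration.
awk_first_cols() {
    awk -v n="$1" '{
        # Stop at NF so lines with fewer than n fields are printed as they are.
        for (i = 1; i <= n && i <= NF; i++)
            printf "%s%s", (i == 1 ? "" : " "), $i
        print ""
    }'
}

# Made-up sample: the first line has a double space, the second is short.
printf '1807  1452 1598 1 6.655713\n1808 1452\n' | awk_first_cols 3
```

Passing the data through a "%s" format rather than using it as the format string keeps a stray % in a field from being interpreted by printf.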
answered Mar 21 at 22:04 by dessert; edited Mar 21 at 22:54
For a single-character delimiter (such as space or comma) I would recommend using the cut command over either awk or sed.
However, since you asked about awk specifically, I think a reasonable way to do it would be to decrement the field count:
awk -v last=676 '{while(NF>last) NF--} 1' datafile
Tested in GNU Awk (gawk) and mawk.
answered Mar 21 at 22:45 by steeldriver; edited Mar 21 at 23:19
Why not just NF = last; print instead of the loop? – wchargin, Mar 22 at 4:36
@wchargin Doh! yes that's much better - wanna post it as an answer? – steeldriver, Mar 22 at 4:49
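The direct assignment suggested in the comment could look like the sketch below. The helper name truncate_fields and the NF > last guard are assumptions of mine: an unconditional NF = last would pad lines that have fewer fields with empty ones. It relies on awk rebuilding the record when NF is assigned, which the answer above reports for gawk and mawk; other implementations may differ.

```shell
#!/bin/sh
# truncate_fields N: drop every field after the Nth by assigning to NF.
# Hypothetical helper; relies on awk rebuilding $0 when NF is assigned.
truncate_fields() {
    # Only shrink NF; lines with N fields or fewer are printed unchanged.
    awk -v last="$1" '{ if (NF > last) NF = last } 1'
}

# Made-up sample: one long line and one short line.
printf '1807 1452 1598 1 6.655713\n1808 1452\n' | truncate_fields 3
```

Truncated lines are rejoined with the output field separator (a single space by default), while untouched lines keep their original spacing.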
You could use
mlr --nidx --fs ' ' --repifs cat inputFile.csv | cut -d ' ' -f-2
This way mlr (Miller, https://github.com/johnkerl/miller/releases/tag/5.4.0) takes care of the field separators (a run of several spaces becomes a single separator between fields), and cut then extracts the first two fields in my example.
From
1807 1452 1598 1 6.655713 A B A B
1808 1452 1763 1 9.362033 0 0 A B
1809 1452 1527 2 6.728534 A B A A
1810 1452 1367 2 9.4055 A B A A B
to
1807 1452
1808 1452
1809 1452
1810 1452
Some notes about Miller options:
--nidx sets the format: a generic index-numbered table (the first field is 1, the second is 2, etc.);
--fs sets the separator (here a space);
--repifs means that multiple successive occurrences of the field separator count as one;
cat passes input records directly to output.
answered Mar 22 at 7:13 by aborruso; edited Mar 22 at 9:04