Linux Commando: Use sed or perl to extract every nth line in a text file

Thursday, April 17, 2008

Use sed or perl to extract every nth line in a text file

I recently blogged about the use of sed to extract lines in a text file.

As examples, I showed some simple cases of using sed to extract a single line and a block of lines in a file.

An anonymous reader asked how one would extract every nth line from a large file.

Suppose somefile contains the following lines:

$ cat > somefile
line 1
line 2
line 3
line 4
line 5
line 6
line 7
line 8
line 9
line 10
$

Below, I show two ways to extract every 4th line, that is, lines 4 and 8 of somefile.

sed
```
$ sed -n '0~4p' somefile
line 4
line 8
$
```
0~4 is a GNU sed first~step address: it selects every 4th line, beginning at line 0. Other sed implementations may not accept this form.

Line 0 does not exist, so the first printed line is line 4.

-n means only explicitly printed lines are included in the output.
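
If your sed does not accept first~step addresses, the same selection can be sketched with the standard n command, which replaces the pattern space with the next input line. This is a minimal sketch that assumes a POSIX-style sed; the file name somefile is the sample file from this post.

```shell
# Recreate the ten-line sample file used in this post.
for i in 1 2 3 4 5 6 7 8 9 10; do echo "line $i"; done > somefile

# Each cycle starts on one line, advances three more lines with n,
# then prints (p) the line it landed on, which is every 4th line.
# With -n, the lines skipped over by n are not printed.
sed -n 'n;n;n;p' somefile
```

If the input runs out in the middle of a cycle, sed quits without printing, so a trailing group of fewer than 4 lines produces no output.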

perl
```
$ perl -ne 'print((0 == $. % 4) ? $_ : "")' somefile
line 4
line 8
$
```
$. is the current input line number.

% is the remainder operator.

$_ is the current line.

The above perl statement prints a line if its line number is evenly divisible by 4 (remainder = 0).

Alternatively,
```
$ perl -ne 'print unless (0 != $. % 4)' somefile
line 4
line 8
$
```


6 comments:

tsilver said...: Thank you. I don't use SED enough and this was a good reminder.; February 2, 2009 at 5:11 PM
Anonymous said...: Note that your last perl example (already much more readable than the 1st) can be further simplified to

perl -ne 'print unless ($. % 4)' somefile

Since in perl, 0 is false in a boolean context, the "0 != " test is redundant.; March 19, 2009 at 4:22 AM
Anonymous said...: I am sure the author knows that. Adding "0 != " adds clarity to the code and makes it readable, and it doesn't cost any extra machine cycles, FYI!; December 7, 2009 at 11:36 PM
Anonymous said...: I am currently using exactly what you suggest in your sed example. My problem is that my file is quite large: almost 5 million lines. I also need certain blocks of lines, e.g. every other set of, say, 10 lines. So I wrote a bash script for it, but it is taking a very long time. I wonder whether that is because, although -n suppresses the output of the majority of the lines, sed is still traversing them all. I don't know if this is true.
In any case, would you be able to suggest a more efficient way of doing what I am trying to do?; March 30, 2010 at 10:57 AM
Gopi said...: sed -n '3~3p'

The above command is not working. It's saying Unrecognized command:3~3P

Can you please help me with this?; June 18, 2012 at 4:00 AM
Unknown said...: Gopi, it is working for me.

[root@dachis-centos ~]# sed -n '3~3p' sample.sh
for i in {1..3}
done; September 3, 2012 at 6:42 AM
