Wednesday, June 18, 2008

Create File of a Given Size ... with random contents

A while back, I wrote about how to create a zero-filled file of an arbitrary size. This is part 2, where I share how to create a file with random contents (not just zeroes).

Recently, I ran into a situation where a zero-filled file was insufficient. I needed a 2 MB log file that would be zipped up and copied to another server.

To create the 2 MB file (with all zeroes), I ran the dd command:
$ dd if=/dev/zero of=a.log bs=1M count=2

I quickly realized that the test result would be invalid because zipping an all-zero file dramatically reduced its size, from 2 MB to about 2 KB.

$ gzip a.log
$ ls -hl a.log*
-rw-r--r-- 1 peter peter 2.1K 2008-06-14 14:36 a.log.gz
$


I decided to create a 2 MB file of random contents instead. This is how.

$ dd if=/dev/urandom of=a.log bs=1M count=2
2+0 records in
2+0 records out
2097152 bytes (2.1 MB) copied, 1.00043 seconds, 2.1 MB/s
$ gzip a.log
$ ls -hl a.log*
-rw-r--r-- 1 peter peter 2.1M 2008-06-14 14:43 a.log.gz
$
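
The comparison above can be repeated in one go. This is only a minimal sketch: the names zero.log and random.log and the scratch directory are made up for illustration, and bs=1048576 spells out 1 MiB in bytes because some dd implementations do not accept the 1M suffix.

```shell
# Work in a scratch directory so nothing in the current directory is overwritten.
workdir=$(mktemp -d)

# Create one 2 MiB zero-filled file and one 2 MiB random file.
# bs is given in bytes (1048576 = 1 MiB) for portability.
dd if=/dev/zero    of="$workdir/zero.log"   bs=1048576 count=2 2>/dev/null
dd if=/dev/urandom of="$workdir/random.log" bs=1048576 count=2 2>/dev/null

# gzip -c writes the compressed data to stdout and leaves the input file in place;
# wc -c counts the compressed bytes (tr strips the padding some wc versions add).
zero_gz=$(gzip -c "$workdir/zero.log" | wc -c | tr -d ' ')
random_gz=$(gzip -c "$workdir/random.log" | wc -c | tr -d ' ')

echo "zero-filled, gzipped: $zero_gz bytes"
echo "random, gzipped:      $random_gz bytes"

# Clean up the scratch directory.
rm -r "$workdir"
```

The zero-filled file should shrink to a few kilobytes, while the random file should stay at roughly its original 2 MiB.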


To take a look at the random contents of a.log, use the hexdump command. Do this before gzipping the file, because gzip replaces a.log with a.log.gz.

$ hexdump a.log | head
0000000 c909 2da7 4a77 22fc 88b6 b394 be42 b0c1
0000010 1531 f9d5 4b3d 390d e670 da2c e7e9 b681
0000020 0518 2b5d 5a66 ef76 c297 7f73 2d0b 453e
0000030 ba47 c268 26f9 79b5 1816 82ac 2e76 0ff2
0000040 c1e8 e14f 898f 2507 9c29 83b7 226c 0d65
0000050 f3f6 6eb4 62d9 410b b566 c522 ffca fbac
0000060 81f6 d91c dd34 18cd f873 8073 fa02 20c1
0000070 06bb 7e32 dc2e 13b2 a345 aadd 8700 fa9e
0000080 e28e 1b58 c25f 4619 c8bc 8110 6306 a2fc
0000090 9766 d98f 648e cec7 d654 2eaa 1f6f 839f
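
By default, hexdump groups the bytes into 16-bit words in the machine's byte order, so on a little-endian system each pair of bytes appears swapped relative to its order in the file. If you would rather see the bytes one at a time in file order, od can do that (hexdump -C is another option). A rough sketch, using a small throwaway sample file rather than a.log:

```shell
# Create a 1 KiB throwaway sample of random data.
sample=$(mktemp)
dd if=/dev/urandom of="$sample" bs=1024 count=1 2>/dev/null

# -A x prints the offsets in hexadecimal;
# -t x1 prints one byte per column, in the order the bytes appear in the file.
od -A x -t x1 "$sample" | head

# Confirm the sample has the expected size, then remove it.
bytes=$(wc -c < "$sample" | tr -d ' ')
echo "sample size: $bytes bytes"
rm "$sample"
```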

12 comments:

Sony Jose said...

Thanks! It was useful :)

Syclone0044 said...

Nice blog - you're the #1 Google result when I searched today for "linux command to generate random file" and you delivered exactly what I needed. I was using /dev/random, which for some reason doesn't return to the command prompt. Your command worked fast and gave me the file I needed to test my new web host's download speeds. Thanks.

Anonymous said...

To Syclone0044: /dev/random was not returning for you simply because not enough entropy had yet been generated. /dev/random delivers only "pure" random data, which is very, very slow to accumulate on a normal system.

/dev/urandom, by contrast, delivers semi-random data generated by a PRNG which is fed by the trickle of real entropy from /dev/random.

For pretty much all intents and purposes, you are better off using /dev/urandom for your application, as the data generated should be random enough and it is perhaps a thousand times (or more) faster.

Zach said...

Thank you, this was just what I was looking for!

robotshoelaces said...

Thanks! I was using /dev/random, too. I wasn't getting the file sizes I wanted and it was hella slow.

/dev/urandom let me do exactly what I wanted: Generate 5KB keyfiles for Truecrypt volumes!

Unknown said...

Glad this was here. Thanks for the help.

icey said...

Nice, this was just what I was looking for :)

However, on Mac OS X it didn't recognize 1M,
so I used 1024000 instead...

Thanks !!

Anonymous said...

Zipping random information is an equally invalid test as zipping an all-0 file, as generally speaking purely random data won't compress at all (more or less).

Tim T. said...

Thanks, man!

Martin said...

Cheers! Just what I was looking for :)

Anonymous said...

fascinating stuff!

Learn more about the kernel random source devices with (on your favorite Linux distro): 'man urandom'.

Anonymous said...

Thanks!