New text compression record

Posted on Sunday, Jul 15 2007 @ 11:02 CEST by Thomas De Maesschalck
Slashdot reports that Alexander Ratushnyak has set a new text compression record.

He managed to compress the first 100,000,000 bytes of Wikipedia to a record-small 16,481,655 bytes (including the decompression program):
Thereby he not only won the second payout of The Hutter Prize for Compression of Human Knowledge, but also brought text compression within 1% of the threshold for artificial intelligence. Achieving 1.319 bits per character, this makes the next winner of the Hutter Prize likely to reach the threshold of human performance (between 0.6 and 1.3 bits per character) estimated by the founder of information theory, Claude Shannon, and confirmed by Cover and King in 1978 using text prediction gambling. When the Hutter Prize started, less than a year ago, the best performance was 1.466 bits per character.
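The 1.319 bits per character figure can be checked from the two byte counts quoted above. The short Python sketch below does that arithmetic, assuming each input byte counts as one character (8 bits per byte); the function and variable names are illustrative only and do not come from the Hutter Prize or Slashdot.

```python
# Figures quoted in the post.
ORIGINAL_BYTES = 100_000_000      # first 100,000,000 bytes of Wikipedia
COMPRESSED_BYTES = 16_481_655     # archive size, including the decompressor
PREVIOUS_BEST_BPC = 1.466         # best result when the prize started

BITS_PER_BYTE = 8  # assumption: one input byte is treated as one character


def bits_per_character(compressed_bytes: int, original_bytes: int) -> float:
    """Return the compressed size in bits divided by the input size in characters."""
    return compressed_bytes * BITS_PER_BYTE / original_bytes


bpc = bits_per_character(COMPRESSED_BYTES, ORIGINAL_BYTES)

# Relative reduction compared with the best result at the start of the prize.
improvement = (PREVIOUS_BEST_BPC - bpc) / PREVIOUS_BEST_BPC

print(f"{bpc:.3f} bits per character")
print(f"{improvement:.1%} below the starting figure of {PREVIOUS_BEST_BPC}")
```

Rounded to three decimals, the result matches the 1.319 bits per character quoted above, and it works out to roughly a tenth less than the 1.466 the prize started with.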

