Wikipedia
From iPodLinux
This page is about getting Wikipedia downloaded and readable on the iPod. At the moment all progress is in text format, but robamler is planning an ebook viewer that should be superior in that it will have links and compression.
Howto
Here are instructions on how to convert some or all of Wikipedia. This is for Linux users, but if you are a smart Mac or Windows user, you should be able to figure it out for yourself.
First, download the latest Wikipedia MySQL dump from http://dumps.wikimedia.org (download cur under en.wikipedia).
This will take a while, giving you time to install MediaWiki.
Download the latest version of MediaWiki to your htdocs folder.
Extract it:
cd /var/www/localhost/htdocs/
tar -xvzpf mediawiki-*.tar.gz
Rename the extracted directory wiki, and set wiki/config/ executable:
mv mediawiki-*/ wiki
chmod +x wiki/config
In your favorite web browser, visit http://localhost/wiki/config/. Fill out everything as you wish (I'm not helping you here), but make a note of what you name the database. I named my database enwiki, so I'll be using that in my examples. When done, copy config/LocalSettings.php to the main wiki directory.
cp wiki/config/LocalSettings.php wiki/
Create the file wiki/skins/Simple.php with this as its contents:
<?php
// Minimal skin: title block plus article body, no navigation.
if( !defined( 'MEDIAWIKI' ) )
    die();

class SkinSimple extends Skin {
    function initPage() {}
    function getStylesheet() {return '';}
    function getSkinName() {return "simple";}

    // Emit the title block, then open the article container.
    function doBeforeContent() {
        $s = "\n<div id='content'>\n<div id='topbar'>";
        $s .= $this->pageTitle() . $this->pageSubtitle() . "\n";
        $s .= "<br /><br />\n</div>\n\n<div id='article'>";
        return $s;
    }

    // Return nothing for the top links, footer, and source view.
    function topLinks() {return '';}
    function doAfterContent() {return '';}
    function printSource() {return '';}
}
?>
Set simple as the default skin in wiki/LocalSettings.php:
...
$wgDefaultSkin = 'simple';
...
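If you would rather not open an editor, the same change can be made from the shell. This is only a sketch: set_default_skin is a made-up helper name, and it assumes LocalSettings.php already contains a line starting with $wgDefaultSkin (it gives up if there isn't one) and that the skin name is a plain word.

```shell
# set_default_skin FILE SKIN
# Hypothetical helper: rewrite the existing $wgDefaultSkin line in FILE.
set_default_skin() {
    file=$1
    skin=$2
    # Give up if there is no existing setting to replace.
    grep -q '^\$wgDefaultSkin' "$file" || return 1
    tmp=$(mktemp) || return 1
    # Replace the whole assignment line; every other line passes through.
    sed "s/^\\\$wgDefaultSkin.*/\$wgDefaultSkin = '$skin';/" "$file" > "$tmp" &&
        cat "$tmp" > "$file"
    status=$?
    rm -f "$tmp"
    return $status
}
```

You would call it as set_default_skin wiki/LocalSettings.php simple.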
When the download has finished, extract the database dump and restore it to your database:
gunzip *_cur_table.sql.gz
mysql -uroot -ppassword enwiki < *_cur_table.sql
Visit the wiki at http://localhost/wiki to make sure it's working.
Now for the useful part. Make a new folder somewhere to store everything.
mkdir ~/wikipedia
cd ~/wikipedia
Create a file named dumpwiki.sh with this as its contents:
#!/bin/sh
MYSQL="mysql -uroot -ppassword --batch -e" # use your own MySQL root password
WIKIPEDIA="http://localhost/wiki/index.php"
LETTER="." #any letter
#LETTER="a" #only 'A'
#LETTER="[a-c]" #'A', 'B' or 'C'
$MYSQL "use enwiki; SELECT cur_title from cur" \
| grep -iv "[0-9][0-9]*\.[0-9][0-9]*\.[0-9][0-9]*\.[0-9][0-9]*" \
| grep -iv "\!" \
| grep -iv "Requests_for_" \
| grep -iv "votes_for" \
| grep -iv "votes_on" \
| grep -iv "needing_votes" \
| grep -iv "images_" \
| grep -iv \" \
| grep -iv \' \
| grep -iv "\....$" \
| grep -iv '\%' \
| grep -iv '\$' \
| grep -iv '\&' | grep -iv '\?' \
| grep -iv "cur_title" \
| grep -iv '\-' \
| grep -iv '\*' \
| grep -iv '^\.' \
| grep -iv '\:' \
| grep -iv '\/' \
| grep -i "^$LETTER" \
> wikititles
mkdir -p articles
cd articles
rm -f /tmp/wikiart
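for name in `cat ../wikititles`; do
    echo "$name"
    # Skip articles that were already fetched and converted.
    if [ ! -f "$name" ]
    then
        if wget -nv "$WIKIPEDIA/$name"
        then
            # Convert the saved HTML page to plain text in place.
            html2text -ascii -nobs "$name" > /tmp/wikiart
            rm -f "$name"
            mv /tmp/wikiart "$name"
        else
            rm -f "$name"
        fi
    fi
done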
# Plain [ ] is used here so the test also works when /bin/sh is not bash.
if [ "$LETTER" = "." ]; then
    tar -cjf "../all.tar.bz2" .
else
    rm -f "../$LETTER.tar" "../$LETTER.tar.bz2"
    find ./ -type f | grep -i "^./$LETTER" | tar -cjf "../$LETTER.tar.bz2" -T -
fi
Run the file:
sh dumpwiki.sh
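The long grep chain in dumpwiki.sh is easier to follow once you see what it throws away: IP-address titles, a few kinds of project pages, names that look like files (a dot followed by three characters), titles starting with a dot, and anything containing characters that are awkward in URLs or file names. Here is a condensed sketch of the same idea. The function name filter_titles is made up, and the single pattern list is meant to match the chain's behaviour but has not been checked against a full dump.

```shell
# filter_titles: read page titles on stdin, print the ones worth fetching.
# A condensed sketch of the grep chain in dumpwiki.sh, not a drop-in replacement.
filter_titles() {
    grep -Eiv \
        -e '[0-9]+\.[0-9]+\.[0-9]+\.[0-9]+' \
        -e 'requests_for_|votes_for|votes_on|needing_votes|images_' \
        -e '\....$' \
        -e '^\.' \
        -e 'cur_title' \
        -e "[!\"'%\$&?*:/-]"
}

# Try it on a handful of sample titles.
printf '%s\n' cur_title Anarchism 127.0.0.1 'AC/DC' Image.jpg Albert_Einstein | filter_titles
```

With the sample input, only Anarchism and Albert_Einstein should come out the other end.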
If you have any questions, please email timmyisdaman@gmail.com.
Articles
If you do not want to dump the articles yourself, all of the articles have been dumped and are available here.
Limitations
iPodLinux
- No hyperlinks
- Can't fit all the files in one directory
Rockbox
- No hyperlinks
- Can't fit all the files in one directory
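One possible way around the single-directory problem is to spread the articles over subdirectories. The sketch below is only an illustration: shard_articles is a made-up name, the one-directory-per-first-character layout is an assumption (a single letter may still hold too many files, so you might need a finer split), and titles that don't start with an ASCII letter or digit all go into a directory called other.

```shell
# shard_articles SRC DEST
# Move every regular file in SRC into DEST/<x>/, where <x> is the lowercased
# first character of the file name. The layout is an assumption.
shard_articles() {
    src=$1
    dest=$2
    for path in "$src"/*; do
        [ -f "$path" ] || continue      # skip subdirectories and empty globs
        name=${path##*/}
        rest=${name#?}                  # the name without its first character
        first=$(printf '%s' "${name%"$rest"}" | tr '[:upper:]' '[:lower:]')
        case $first in
            [a-z0-9]) ;;
            *) first=other ;;           # anything else shares one bucket
        esac
        mkdir -p "$dest/$first" && mv "$path" "$dest/$first/$name"
    done
}
```

For example, shard_articles ~/wikipedia/articles ~/wikipedia/sharded would put Anarchism under sharded/a/ and Zebra under sharded/z/.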
Apple Firmware
- The iPod software only reads the first 4 KB of any text file in the Notes/ directory. The remainder of the file is ignored.
- The iPod software only reads a total of 1000 text files from the Notes/ directory. While it's easily possible to place a lot more text files on your iPod, the software will only read the first 1000 files and ignore the rest.
These restrictions apply to all iPod models, including the newer fifth- and sixth-generation iPods.
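If you want to try the Notes/ directory anyway, an article can be cut into pieces that each fit under the size limit. This is a rough sketch: the helper names are made up, 4 KB is taken to mean 4096 bytes, and split cuts at byte boundaries, so a piece can end in the middle of a word.

```shell
# split_for_notes FILE OUTDIR
# Cut FILE into pieces of at most 4096 bytes, written as
# OUTDIR/<name>.aa, OUTDIR/<name>.ab, and so on.
split_for_notes() {
    file=$1
    outdir=$2
    mkdir -p "$outdir" || return 1
    split -b 4096 "$file" "$outdir/${file##*/}."
}

# notes_within_limit DIR
# Succeed only if DIR holds at most 1000 files.
notes_within_limit() {
    count=$(find "$1" -type f | wc -l)
    [ "$((count))" -le 1000 ]
}
```

With 1000 files of 4 KB each, that works out to roughly 4 MB of text at most, so this only makes sense for a small selection of articles.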