Tar.xz archive of metadata extracted from Wikipedia, derived from an 83 GB Wikipedia download. The archive is built from word files. Each file contains 650 lines in which a given word appears.
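A minimal sketch of how such an archive might be read, using only Python's standard `tarfile` module. The description does not specify member names, text encoding, or the exact layout, so the UTF-8 decoding, the one-file-per-word layout, and the demo member `cat.txt` are all assumptions made for illustration. The helper names `iter_word_files` and `build_demo_archive` are hypothetical.

```python
import io
import tarfile
from typing import BinaryIO, Iterator, List, Tuple


def iter_word_files(fileobj: BinaryIO) -> Iterator[Tuple[str, List[str]]]:
    """Yield (member_name, lines) for each regular file in a .tar.xz stream."""
    with tarfile.open(fileobj=fileobj, mode="r:xz") as archive:
        for member in archive:
            # Skip directories, links, and other non-file entries.
            if not member.isfile():
                continue
            extracted = archive.extractfile(member)
            if extracted is None:
                continue
            # Assumption: the word files are UTF-8 text, one entry per line.
            text = extracted.read().decode("utf-8", errors="replace")
            yield member.name, text.splitlines()


def build_demo_archive() -> io.BytesIO:
    """Build a tiny in-memory .tar.xz with made-up content, for illustration only."""
    buffer = io.BytesIO()
    with tarfile.open(fileobj=buffer, mode="w:xz") as archive:
        payload = b"The cat sat.\nA cat ran.\n"
        info = tarfile.TarInfo(name="cat.txt")  # hypothetical member name
        info.size = len(payload)
        archive.addfile(info, io.BytesIO(payload))
    buffer.seek(0)
    return buffer


if __name__ == "__main__":
    for name, lines in iter_word_files(build_demo_archive()):
        print(name, len(lines))
```

For a real archive on disk, the same function could be called as `iter_word_files(open(path, "rb"))`, where `path` points at the downloaded `.tar.xz` file. Reading member by member avoids unpacking the whole archive to disk first.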