Sha256: 7b23933d0701e0b071f546a749862421fb42611462c0867d956e18abd84dc84e

Contents?: true

Size: 1.08 KB

Versions: 1

Compression:

Stored size: 1.08 KB

Contents

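require 'csv'          # CSV.parse, used in generate_twitter_corpus
require 'marky_markov' # MarkyMarkov::Dictionary, used in generate_sentence

module Ebooks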
  module Generator

    def self.generate_twitter_corpus(tweets_csv_path = 'tweets.csv', corpus_path = 'markov_dict.txt')
      # Go to Twitter.com -> Settings -> Download Archive.
      # This tweets.csv file is in the top directory. Put it in the same directory as this script.
      # Drop the first row, which holds the column names rather than a tweet.
      csv_text = CSV.parse(File.read(tweets_csv_path)).drop(1)

      # Create a new clean file of text that acts as the seed for your Markov chains
      File.open(corpus_path, 'w') do |file|
        csv_text.reverse.each do |row|
          # Strip links and newlines (to_s guards against an empty text field, which parses as nil)
          tweet_text = row[5].to_s.gsub(/(?:f|ht)tps?:\/[^\s]+/, '').gsub(/\n/,' ')
          # Save the text
          file.write("#{tweet_text}\n")
        end
      end
    end

    def self.generate_sentence(corpus_path = 'markov_dict.txt')
      # Run when you want to generate a new Markov tweet
      markov = MarkyMarkov::Dictionary.new('dictionary') # Saves/opens dictionary.mmd
      markov.parse_file(corpus_path)
      tweet_text = markov.generate_n_sentences(2).split(/\#\</).first.chomp.chop
      markov.save_dictionary!
      tweet_text # Return the generated text rather than the result of save_dictionary!
    end

  end
end
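
The cleaning step in generate_twitter_corpus can be tried on its own, without a Twitter archive or the marky_markov gem. The following is a minimal sketch, not part of the gem: `clean_tweet` is a hypothetical helper that applies the same two substitutions as the generator, and the sample CSV is made-up data that assumes the tweet text sits in the sixth column (index 5), as the generator does.

```ruby
require 'csv'

# Hypothetical helper (not in the gem): the same substitutions the generator
# applies, removing links and turning newlines into spaces.
def clean_tweet(text)
  text.gsub(/(?:f|ht)tps?:\/[^\s]+/, '').gsub(/\n/, ' ')
end

# Made-up sample data, newest tweet first; the text is assumed to be column 6.
sample_csv = <<~CSV
  tweet_id,in_reply_to_status_id,in_reply_to_user_id,timestamp,source,text
  2,,,2013-01-02,web,newer tweet http://t.co/abc
  1,,,2013-01-01,web,older tweet
CSV

rows = CSV.parse(sample_csv)

# Skip the header row, then reverse so the oldest tweet comes first.
lines = rows.drop(1).reverse.map { |row| clean_tweet(row[5].to_s) }

puts lines
```

Removing a link leaves the surrounding whitespace in place, so a tweet that ends in a URL keeps a trailing space in the corpus.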

Version data entries

1 entry across 1 version & 1 rubygem

Version Path
ebooks-0.0.1 lib/ebooks/generator.rb