Creating a natural language generation news bot.... erm

  1. #1
    filtration African Astronaut
    So, I'm currently scraping a bunch of news websites, and so far I have 100,000 articles. What are we looking at to get a semi-decent dataset for an NLG story? Thousands, millions, hundreds of millions of articles?

    I'll release the dataset once I've scraped a worthwhile amount, and I'll share the byte pair encoded file too.
  2. #2
    Sophie Pedophile Tech Support
    The more the better, obviously, but you should scale to the infrastructure you have available. Which language are you using for this project? Python?
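
For anyone wondering what the byte pair encoded file from post #1 involves, here is a minimal sketch of learning BPE merges from raw text. It assumes Python (only because post #2 guesses at it), whitespace tokenization, and an end-of-word marker `</w>`. The function names are made up for illustration and do not come from any particular library; a real pipeline over 100,000+ articles would more likely use an existing tokenizer package.

```python
from collections import Counter


def get_pair_counts(vocab):
    """Count adjacent symbol pairs across all words, weighted by word frequency."""
    pairs = Counter()
    for symbols, freq in vocab.items():
        for a, b in zip(symbols, symbols[1:]):
            pairs[(a, b)] += freq
    return pairs


def merge_pair(pair, vocab):
    """Return a new vocab where every occurrence of `pair` is fused into one symbol."""
    merged = {}
    for symbols, freq in vocab.items():
        out = []
        i = 0
        while i < len(symbols):
            if i + 1 < len(symbols) and (symbols[i], symbols[i + 1]) == pair:
                out.append(symbols[i] + symbols[i + 1])
                i += 2
            else:
                out.append(symbols[i])
                i += 1
        key = tuple(out)
        merged[key] = merged.get(key, 0) + freq
    return merged


def learn_bpe(text, num_merges):
    """Learn up to `num_merges` merges from whitespace-split text (illustrative helper)."""
    word_counts = Counter(text.split())
    # Each word starts as a tuple of characters plus an end-of-word marker.
    vocab = {tuple(word) + ("</w>",): freq for word, freq in word_counts.items()}
    merges = []
    for _ in range(num_merges):
        pairs = get_pair_counts(vocab)
        if not pairs:
            break
        best = max(pairs, key=pairs.get)  # most frequent adjacent pair
        merges.append(best)
        vocab = merge_pair(best, vocab)
    return merges, vocab


if __name__ == "__main__":
    merges, vocab = learn_bpe("low low low lower", num_merges=2)
    print(merges)
    print(vocab)
```

The learned merge list is what gets applied later to encode new text, so it would need to be shared alongside any encoded file for others to decode it.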