<aside>
💡 Try it out at readtimelines.com
</aside>
What’s this?
- readtimelines.com is a tool for understanding the big picture of an event/topic from multiple viewpoints.
- You can search for events (“Trump Indictment”) or broader topics (“Singapore Economy”), and Timelines will serve you the related events in chronological order.
- An event consists of articles detailing the same happening, from various high-quality news sources across the political spectrum.
- Our aim is to help the reader gain a balanced, nuanced view of the issue at hand.
How it works
- Every day, we scrape 1,000 to 3,000 articles from various sources (e.g. NYTimes, Washington Post), embed them with `all-MiniLM-L6-v2`, and store them in a full-text search instance (Meilisearch).
- When you search for something on Timelines, it queries Meilisearch for matching articles.
- These articles are fed into our machine learning pipeline, which takes in articles and returns events.
- The machine learning pipeline:
    - Splits the articles into sliding windows.
    - Clusters the articles in each window on their embeddings using OPTICS.
    - Discards the largest cluster in each window, since it is often just noise.
- Each remaining cluster of articles forms an event. We sort the events chronologically and return them as a timeline.
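
The pipeline steps above can be sketched roughly as follows. This is an illustrative sketch, not our production code: the `Article` class, the function names, the 7-day window and step, and the OPTICS parameters (`min_samples=2`, cosine distance) are all assumptions made for the example. It also assumes each article already carries its embedding, that windows are defined by publication date, and that scikit-learn's noise label (`-1`) counts as a group when picking the "largest cluster" while never forming an event itself.

```python
from collections import Counter
from dataclasses import dataclass
from datetime import date, timedelta
from typing import Callable, Iterator, List, Sequence


@dataclass
class Article:
    # Hypothetical record; the real article schema is not shown in this page.
    title: str
    published: date
    embedding: Sequence[float]


# A clusterer maps embeddings to one integer label per embedding.
# Label -1 means "noise", following scikit-learn's convention.
Clusterer = Callable[[List[Sequence[float]]], List[int]]


def optics_clusterer(embeddings: List[Sequence[float]], min_samples: int = 2) -> List[int]:
    """Cluster with scikit-learn's OPTICS (parameter values are guesses)."""
    from sklearn.cluster import OPTICS  # imported lazily so the rest runs without it

    if len(embeddings) < min_samples:
        return [-1] * len(embeddings)  # too few points to form any cluster
    model = OPTICS(min_samples=min_samples, metric="cosine")
    return model.fit_predict(list(embeddings)).tolist()


def sliding_windows(articles: List[Article], window_days: int = 7,
                    step_days: int = 7) -> Iterator[List[Article]]:
    """Yield the articles published in successive [start, start + window) ranges."""
    if not articles:
        return
    ordered = sorted(articles, key=lambda a: a.published)
    start, last = ordered[0].published, ordered[-1].published
    while start <= last:
        end = start + timedelta(days=window_days)
        window = [a for a in ordered if start <= a.published < end]
        if window:
            yield window
        start += timedelta(days=step_days)


def build_timeline(articles: List[Article], cluster: Clusterer,
                   window_days: int = 7, step_days: int = 7) -> List[List[Article]]:
    """Window, cluster, drop the largest group per window, sort events by date."""
    events: List[List[Article]] = []
    seen = set()  # skips identical events produced by overlapping windows
    for window in sliding_windows(articles, window_days, step_days):
        labels = cluster([a.embedding for a in window])
        sizes = Counter(labels)
        largest = sizes.most_common(1)[0][0]  # assumed to be mostly noise
        for label in sizes:
            if label == largest or label == -1:
                continue
            members = [a for a, lab in zip(window, labels) if lab == label]
            key = frozenset(a.title for a in members)
            if key not in seen:
                seen.add(key)
                events.append(members)
    # An event is dated by its earliest article in this sketch.
    events.sort(key=lambda event: min(a.published for a in event))
    return events
```

Calling `build_timeline(articles, optics_clusterer)` would run the sketch with OPTICS, provided scikit-learn is installed. Passing the clusterer as an argument keeps the windowing and filtering logic easy to test with a stub.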
FAQ
- Why build our own data pipeline?
- What does the data pipeline look like?
- Why Meilisearch?
- Why OPTICS?