The internet is overflowing with news, but extracting clean, readable content from articles can be a tedious task. Whether you’re aggregating news for personal consumption, research, or AI training, automating this process is a must. Enter Uninews, a powerful, lightweight, and efficient Rust-based news scraper that simplifies content extraction and conversion into Markdown format.
What is Uninews?
Uninews is a universal news scraper that downloads an article from a given URL, cleans up the HTML, and formats the content into Markdown using OpenAI’s GPT-4o via CloudLLM. The final output is a structured JSON response containing:
- Title of the article
- Markdown-formatted content
- Featured image URL
When used as a command-line tool, Uninews simply outputs the extracted Markdown, making it easy to read or integrate into your workflow.
Key Features
✅ Smart Content Extraction: Targets <article> tags to get the main content, falling back to <body> if needed.
✅ Clean Markdown Conversion: Uses GPT-4o (via CloudLLM) to generate clean, structured Markdown from raw HTML.
✅ Reusable Rust Library: The universal_scrape function can be integrated into any Rust project.
✅ Multilingual Support: Specify a language for the output, defaulting to English.
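The <article>-then-<body> fallback from the feature list can be illustrated with a small standard-library sketch. This is not Uninews's actual implementation: the helper names `inner_html` and `main_content` are hypothetical, and the naive string matching here stands in for whatever HTML parsing the tool really uses.

```rust
/// Return the inner HTML of the first `<tag ...>...</tag>` pair, if any.
/// Naive, case-sensitive string matching (hypothetical helper); a real
/// scraper would rely on a proper HTML parser instead.
fn inner_html<'a>(html: &'a str, tag: &str) -> Option<&'a str> {
    let open = format!("<{}", tag);
    let close = format!("</{}>", tag);
    let start = html.find(&open)?;
    // Skip past the end of the opening tag, attributes included.
    let content_start = start + html[start..].find('>')? + 1;
    let content_end = content_start + html[content_start..].find(&close)?;
    Some(&html[content_start..content_end])
}

/// Prefer `<article>`, fall back to `<body>`, otherwise return the input unchanged.
fn main_content(html: &str) -> &str {
    inner_html(html, "article")
        .or_else(|| inner_html(html, "body"))
        .unwrap_or(html)
}

fn main() {
    let with_article =
        "<html><body><nav>Menu</nav><article class=\"post\"><p>Story</p></article></body></html>";
    let without_article = "<html><body><p>Only body</p></body></html>";

    println!("{}", main_content(with_article)); // prints "<p>Story</p>"
    println!("{}", main_content(without_article)); // prints "<p>Only body</p>"
}
```

Trying the narrower container first keeps navigation menus and footers out of the text that is later converted to Markdown, which is presumably why the tool prefers <article> when it exists.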
Installation
You need Rust and Cargo installed to get started.
Install via Cargo
cargo install uninews
Or Build from Source
git clone https://github.com/gubatron/uninews.git
cd uninews
make build
make install
Running Uninews
Before running Uninews, set your OpenAI API key:
export OPEN_AI_SECRET=sk-xxxxxxxxxxxxxxxxxxxxxxxxxx
Then, scrape a news article:
uninews https://example.com/news-article
You can also specify the output language:
uninews -l spanish https://example.com/news-article
Command-line Options
Usage: uninews [OPTIONS] <URL>
Arguments:
  <URL>  The URL of the news article to scrape
Options:
  -l, --language <LANGUAGE>  Output language (default: English)
  -h, --help                 Print help
  -V, --version              Print version
Integrating Uninews in Your Rust Project
Uninews can be used as a library to scrape news articles programmatically:
use uninews::{universal_scrape, Post};
// The code below must run inside an async function, since universal_scrape is awaited.
// Scrape and convert a news article into Markdown
let post = universal_scrape("https://example.com/news", "english").await;
// An empty error string means the scrape succeeded
if !post.error.is_empty() {
    eprintln!("Error: {}", post.error);
    return;
}
println!("{}\n\n{}", post.title, post.content);
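If the empty-string error convention feels awkward in a larger program, it can be wrapped into a `Result`. The sketch below uses a local stand-in struct, `ScrapedPost`, rather than the real `Post` type. Its three `String` fields are an assumption based only on the fields used above, and `into_result` is a hypothetical helper, not part of the Uninews API.

```rust
/// Local stand-in for the `Post` fields used above (assumed to be `String`s).
struct ScrapedPost {
    title: String,
    content: String,
    error: String,
}

/// Turn the "empty error string means success" convention into a `Result`
/// holding `(title, markdown)` on success and the error message on failure.
fn into_result(post: ScrapedPost) -> Result<(String, String), String> {
    if post.error.is_empty() {
        Ok((post.title, post.content))
    } else {
        Err(post.error)
    }
}

fn main() {
    let post = ScrapedPost {
        title: "Headline".to_string(),
        content: "Body text".to_string(),
        error: String::new(),
    };

    match into_result(post) {
        Ok((title, markdown)) => println!("{}\n\n{}", title, markdown),
        Err(e) => eprintln!("Error: {}", e),
    }
}
```

With this shape, callers can use the `?` operator instead of checking the error field by hand.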
Make sure to set the OpenAI API key before calling universal_scrape:
std::env::set_var("OPEN_AI_SECRET", my_open_ai_secret);
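A program that reads the key from the environment rather than setting it may want to fail early when it is missing. The following is a minimal sketch using only the standard library. The function name `validate_api_key` and its error messages are hypothetical, and only the variable name OPEN_AI_SECRET comes from Uninews.

```rust
use std::env;

/// Accept a key only if it is present and not blank (hypothetical helper).
fn validate_api_key(value: Option<String>) -> Result<String, String> {
    match value {
        Some(key) if !key.trim().is_empty() => Ok(key),
        Some(_) => Err("OPEN_AI_SECRET is set but empty".to_string()),
        None => Err("OPEN_AI_SECRET is not set".to_string()),
    }
}

fn main() {
    // `env::var` returns Err when the variable is missing or not valid Unicode;
    // `.ok()` folds both cases into `None`.
    match validate_api_key(env::var("OPEN_AI_SECRET").ok()) {
        Ok(_) => println!("API key found; safe to call universal_scrape"),
        Err(msg) => eprintln!("Error: {}", msg),
    }
}
```

Checking up front gives a clear local error message instead of a failed request later in the scrape.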
Why Use Uninews?
🚀 Fast: Written in Rust for optimal performance.
🛠 Easy to Use: Simple CLI and library interface.
📖 Readable Output: Well-formatted Markdown conversion.
🔄 Reusable: Works as both a command-line tool and a Rust library.
License
Uninews is open source and licensed under the MIT License.
Copyright (c) 2025 Ángel León.