Extractive Summarization
An NLP technique that creates summaries by selecting and combining the most important sentences directly from the source text, without generating new text.
How It Works
Score each sentence in the source document by importance (using TF-IDF, TextRank, or neural models). Select the top-scoring sentences while skipping any that largely repeat a sentence already selected. Optionally reorder the selected sentences, typically back into their source order, for coherence.
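The score, select, and reorder steps can be sketched in a few lines of standard-library Python. This is a minimal illustration, not a reference implementation: the function names (`split_sentences`, `tokenize`, `summarize`), the regex sentence splitter, the simplified TF-IDF-style weighting (document-level term counts times a smoothed inverse sentence frequency), and the Jaccard threshold `max_overlap` are all assumptions made for the example.

```python
import math
import re
from collections import Counter


def split_sentences(text):
    # Naive splitter (assumption): break after ., ! or ? followed by whitespace.
    return [s.strip() for s in re.split(r"(?<=[.!?])\s+", text) if s.strip()]


def tokenize(sentence):
    return re.findall(r"[a-z0-9]+", sentence.lower())


def summarize(text, k=2, max_overlap=0.5):
    sentences = split_sentences(text)
    tokens = [tokenize(s) for s in sentences]
    n = len(sentences)

    # Term counts over the whole document, and the number of sentences
    # containing each term (each sentence is treated as a "document").
    tf = Counter(t for toks in tokens for t in toks)
    df = Counter(t for toks in tokens for t in set(toks))

    def weight(term):
        # Simplified TF-IDF-style weight with smoothed IDF.
        return tf[term] * (math.log((1 + n) / (1 + df[term])) + 1)

    def score(toks):
        # Step 1: sentence importance = mean weight of its distinct terms.
        terms = set(toks)
        return sum(weight(t) for t in terms) / len(terms) if terms else 0.0

    scores = [score(toks) for toks in tokens]
    ranked = sorted(range(n), key=lambda i: (-scores[i], i))

    def jaccard(a, b):
        a, b = set(a), set(b)
        return len(a & b) / len(a | b) if a | b else 0.0

    # Step 2: greedily take top-scoring sentences, skipping near-duplicates.
    chosen = []
    for i in ranked:
        if all(jaccard(tokens[i], tokens[j]) <= max_overlap for j in chosen):
            chosen.append(i)
        if len(chosen) == k:
            break

    # Step 3: restore source order for readability.
    return [sentences[i] for i in sorted(chosen)]
```

For example, calling `summarize(text, k=2)` on a text that contains the same sentence twice returns that sentence at most once, because the second copy exceeds the overlap threshold against the first.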
Methods
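TextRank: Graph-based. Runs a PageRank-style algorithm over a graph whose nodes are sentences and whose edges are weighted by sentence similarity.
BERT-based: BertSum scores sentences using transformer representations.
Lead-N: Simple baseline taking the first N sentences. Surprisingly effective for news articles, which tend to put the key facts first.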
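As a rough illustration of two of these methods, the sketch below implements a small TextRank variant and the Lead-N baseline using only the standard library. The edge weight is the word-overlap similarity described in the original TextRank paper (shared words divided by the sum of the log sentence lengths); the function names, the naive regex tokenization, the damping factor of 0.85, and the fixed iteration count are assumptions for this example rather than part of any particular library.

```python
import math
import re


def split_sentences(text):
    # Naive splitter (assumption): break after ., ! or ? followed by whitespace.
    return [s.strip() for s in re.split(r"(?<=[.!?])\s+", text) if s.strip()]


def tokenize(sentence):
    return re.findall(r"[a-z0-9]+", sentence.lower())


def similarity(a, b):
    # Shared words normalized by log sentence lengths.
    if not a or not b:
        return 0.0
    denom = math.log(len(a)) + math.log(len(b))
    return len(set(a) & set(b)) / denom if denom > 0 else 0.0


def textrank(sentences, damping=0.85, iterations=50):
    tokens = [tokenize(s) for s in sentences]
    n = len(sentences)
    # Weighted, undirected sentence graph stored as an adjacency matrix.
    w = [[similarity(tokens[i], tokens[j]) if i != j else 0.0
          for j in range(n)] for i in range(n)]
    out_sum = [sum(row) for row in w]
    scores = [1.0] * n
    for _ in range(iterations):
        # Weighted PageRank update: each neighbor j passes on a share of
        # its score proportional to the weight of the edge j -> i.
        scores = [
            (1 - damping) + damping * sum(
                w[j][i] / out_sum[j] * scores[j]
                for j in range(n) if out_sum[j] > 0
            )
            for i in range(n)
        ]
    return scores


def textrank_summary(text, k=2):
    sentences = split_sentences(text)
    scores = textrank(sentences)
    top = sorted(range(len(sentences)), key=lambda i: (-scores[i], i))[:k]
    return [sentences[i] for i in sorted(top)]  # keep source order


def lead_n(text, n=3):
    # Lead-N baseline: the first n sentences, unchanged.
    return split_sentences(text)[:n]
```

A sentence that shares no words with the rest of the document receives no incoming weight and settles at the floor score of `1 - damping`, so it is the last candidate `textrank_summary` would pick. A production system would normally add stopword removal and a proper sentence tokenizer.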
Comparison with Abstractive
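Extractive: faithful to the source, but the result can read as choppy because the selected sentences were not written to follow one another. Abstractive: more fluent, but may introduce factual errors that the source does not support. Modern systems often combine both: extract key sentences, then abstractively refine them.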