ccwarcs R package

This R package lets you access Common Crawl (CC) WARC files via Amazon Web Services (AWS). It includes interfaces for searching the CC Index Server and for downloading WARC files via AWS. Because both interfaces are slow, the package also provides a caching mechanism.
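
To illustrate why caching helps with slow lookups, here is a minimal sketch of an on-disk cache in base R. It is not the package's actual implementation or API: `cached_fetch`, `slow_fetch`, and the cache directory layout are hypothetical names chosen for this example, and `slow_fetch` merely stands in for an index query or a WARC download.

```r
# Hypothetical helper, NOT part of ccwarcs: cache the result of a slow call on disk.
cached_fetch <- function(key, fetch_fn,
                         cache_dir = file.path(tempdir(), "cc_cache")) {
  dir.create(cache_dir, recursive = TRUE, showWarnings = FALSE)
  # Derive a file-system-safe name from the key.
  # Assumption: distinct keys rarely collide; a real cache would hash the key.
  safe_name <- gsub("[^A-Za-z0-9]+", "_", key)
  cache_file <- file.path(cache_dir, paste0(safe_name, ".rds"))

  # Cache hit: return the stored result without calling fetch_fn again.
  if (file.exists(cache_file)) {
    return(readRDS(cache_file))
  }

  # Cache miss: run the slow call once and store its result.
  result <- fetch_fn(key)
  saveRDS(result, cache_file)
  result
}

# Stand-in for a slow request; counts how often it actually runs.
call_count <- 0
slow_fetch <- function(key) {
  call_count <<- call_count + 1
  paste("payload for", key)
}

demo_dir <- tempfile("cc_cache_")
first  <- cached_fetch("example.com/page", slow_fetch, cache_dir = demo_dir)
second <- cached_fetch("example.com/page", slow_fetch, cache_dir = demo_dir)
```

The second call reads the stored `.rds` file instead of invoking `slow_fetch` again, so the slow operation runs only once per key.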