CONTENTdm monthly pageviews far exceed expected amount
Symptom
- In the CONTENTdm server and collection reports, pageviews exceed previous months' or years' data by a significant amount.
Applies to
- CONTENTdm
Resolution
The increase in pageviews in CONTENTdm is likely due to bots harvesting data from public web pages. Websites with public domain or open access content have seen an uptick in such traffic worldwide, often from companies training their AI models. OCLC is aware of this data harvesting and uses various techniques to block these actors when they cause technical problems. However, harvesters continually adapt their techniques to work around our efforts to block them.
OCLC is developing enhanced security features to address the recent increase in unauthorized traffic affecting CONTENTdm. These new security measures are being carefully designed and will be implemented gradually. OCLC leverages advanced technologies and specialized expertise to prevent unauthorized traffic from reaching our public-facing systems. These enhancements reflect our ongoing commitment to maintaining the integrity, performance, and security of OCLC systems.
The basic reporting available in CONTENTdm does not allow for interpretation of the pageview data. However, CONTENTdm supports Google Analytics 4 integration, which offers more granular reporting and can provide additional insight.
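To judge whether a given month is out of line with earlier ones, it can help to compare it against a baseline. The following is a minimal Python sketch, not an OCLC tool: it assumes you have copied monthly pageview totals by hand from the CONTENTdm reports into a dictionary. The function name `flag_spikes`, the six-month window, the 3x threshold, and the sample figures are all hypothetical choices made for illustration.

```python
from statistics import median


def flag_spikes(monthly_pageviews, window=6, factor=3.0):
    """Flag months whose pageviews exceed `factor` times the median
    of the preceding `window` months.

    `monthly_pageviews` maps a month label to a pageview total and
    must be in chronological order.
    """
    months = list(monthly_pageviews.items())
    flagged = []
    for i in range(window, len(months)):
        month, views = months[i]
        # The median is less distorted than the mean if an earlier
        # spike is already inside the comparison window.
        baseline = median(v for _, v in months[i - window:i])
        if baseline > 0 and views > factor * baseline:
            flagged.append((month, views, baseline))
    return flagged


# Hypothetical figures, not real CONTENTdm data.
sample = {
    "2024-07": 12000, "2024-08": 11500, "2024-09": 13000,
    "2024-10": 12500, "2024-11": 11800, "2024-12": 12200,
    "2025-01": 12900, "2025-02": 58000,
}

for month, views, baseline in flag_spikes(sample):
    print(f"{month}: {views} pageviews vs. baseline {baseline:.0f}")
```

A comparison like this only shows that a month is unusual. It cannot tell bot traffic apart from human visitors, which is where the more granular Google Analytics 4 reporting is useful.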