TrustYourWebsite

How Our Scanner Detects Copyrighted Images on Your Website

6 April 2026

Every image on your website has a legal status. It's either properly licensed, in the public domain, or a liability waiting to become expensive. The problem is that most business owners have no idea which category their images fall into.

Copyright enforcement agencies know this. Companies like Getty Images, CopyTrack and PicRights run automated crawlers that scan millions of websites for unlicensed images. When they find one, they send a demand letter. Typical claims range from 500 to 5,000 euros per image, so three unlicensed images could cost you up to 15,000 euros.

Our scanner finds these problems before the enforcement agencies do. Here's what it checks and how it works.

The financial risk is real

Image copyright claims are not theoretical. They're one of the most common legal issues facing small business websites in Europe.

The enforcement model is straightforward. Stock photo agencies like Getty Images and Shutterstock invest in building image catalogues. Photographers upload their work. The agencies license those images to businesses. When someone uses an image without paying, the agency loses revenue. Enforcement companies recover that revenue, often with penalties on top.

A single Getty Images demand letter typically asks for 800 to 3,000 euros. If you ignore a copyright demand letter, the amount grows as legal fees and interest accumulate. What started as a 1,000 euro claim can reach 5,000 euros by the third follow-up.

Since the Getty-Shutterstock merger, enforcement has intensified. CopyTrack and PicRights operate on commission, which means they're financially motivated to find every unlicensed image they can.

Most of the time, the business owner didn't choose the image. A web designer grabbed it from Google. A WordPress theme included it. An employee added it to a blog post. None of that matters legally. The website owner pays. That's a reality we cover in detail in our guide on web designer copyright liability.

How enforcement agencies find your images

Understanding how the other side works puts our scanner in context.

Automated web crawlers. Getty, CopyTrack and PicRights run bots that visit public websites, download images and compare them against their catalogues. No human reviews your website. A bot finds a match and triggers the demand letter.

Perceptual hashing. Each catalogued image gets a digital fingerprint based on its visual content. A regular file hash changes completely when a single byte changes. A perceptual fingerprint is designed to survive resizing and format conversion, and the more robust schemes also tolerate cropping and horizontal flipping. If the core visual content matches, the fingerprint matches.
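To make the idea concrete, here is a minimal sketch of one of the simplest perceptual hashes, an "average hash". It is an illustration only, not the algorithm any particular agency uses. It assumes the image has already been decoded into a grid of grayscale values; the function names and sample data are hypothetical.

```python
def average_hash(pixels, size=8):
    """Fingerprint a grayscale image given as a 2D list of 0-255 values.

    The image is reduced to a size x size grid of block averages, and each
    cell becomes one bit: 1 if it is brighter than the grid's mean, else 0.
    Assumes the image is at least size pixels wide and tall.
    """
    height, width = len(pixels), len(pixels[0])
    cells = []
    for gy in range(size):
        y0, y1 = gy * height // size, (gy + 1) * height // size
        for gx in range(size):
            x0, x1 = gx * width // size, (gx + 1) * width // size
            block = [pixels[y][x] for y in range(y0, y1) for x in range(x0, x1)]
            cells.append(sum(block) / len(block))
    mean = sum(cells) / len(cells)
    bits = 0
    for cell in cells:
        bits = (bits << 1) | (1 if cell > mean else 0)
    return bits


def hamming_distance(a, b):
    """Number of differing bits between two fingerprints."""
    return bin(a ^ b).count("1")


# Synthetic test data: a 16x16 image, the same image scaled up 2x,
# and a visually different (inverted) image.
original = [[x * y for x in range(16)] for y in range(16)]
resized = [[original[y // 2][x // 2] for x in range(32)] for y in range(32)]
inverted = [[255 - p for p in row] for row in original]

print(hamming_distance(average_hash(original), average_hash(resized)))
print(hamming_distance(average_hash(original), average_hash(inverted)))
```

A small distance suggests the same picture and a large one suggests different pictures. This simple variant only copes with resizing and similar global changes. Tolerating cropping or flipping takes more elaborate fingerprints than this sketch.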

Reverse image search at scale. Services like TinEye index billions of images. Enforcement agencies upload their catalogues and get a list of every website using them.

Web archive evidence. Agencies use the Wayback Machine and their own crawlers to document unauthorized use with timestamps. Removing an image today doesn't erase the evidence that it was there last month.

Our scanner uses similar techniques, but on your behalf. The goal is to find what enforcement agencies would find, and flag it before they send you a bill.

What the scanner checks

The scanner runs three layers of image copyright detection. Each layer catches different types of risk.

Layer 1: Stock photo CDN pattern matching

The fastest and most reliable check. When you use a stock photo without a license, the image URL often still points to the stock agency's own servers. This happens more often than you'd expect: web designers hotlink images directly from preview pages, or CMS plugins embed thumbnails from the original source.

The scanner matches every image URL on your page against known stock photo CDN patterns. It checks for:

  • Getty Images (media.gettyimages.com)
  • iStock (media.istockphoto.com)
  • Shutterstock (image.shutterstock.com)
  • Adobe Stock (stock.adobe.com)
  • Dreamstime (thumbs.dreamstime.com)
  • Depositphotos (st*.depositphotos.com)
  • 123RF (previews.123rf.com)
  • Alamy (media.alamy.com)
  • Flickr (c*.staticflickr.com)

A match here is a strong signal. It means your website is loading an image directly from a stock agency's content delivery network. That image is almost certainly copyrighted, and unless you have a license that permits hotlinking (most don't), you have a problem.
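A check like this can be sketched in a few lines. The hostnames below are the ones listed above; the function name and the use of shell-style wildcards for the `st*` and `c*` patterns are assumptions made for illustration, not a description of the scanner's internals.

```python
import fnmatch
from urllib.parse import urlparse

# Hostname patterns taken from the list above; "*" is a wildcard.
STOCK_CDN_PATTERNS = {
    "Getty Images": "media.gettyimages.com",
    "iStock": "media.istockphoto.com",
    "Shutterstock": "image.shutterstock.com",
    "Adobe Stock": "stock.adobe.com",
    "Dreamstime": "thumbs.dreamstime.com",
    "Depositphotos": "st*.depositphotos.com",
    "123RF": "previews.123rf.com",
    "Alamy": "media.alamy.com",
    "Flickr": "c*.staticflickr.com",
}


def match_stock_cdn(image_url):
    """Return the agency whose CDN serves image_url, or None if none matches."""
    host = (urlparse(image_url).hostname or "").lower()
    for agency, pattern in STOCK_CDN_PATTERNS.items():
        if fnmatch.fnmatchcase(host, pattern):
            return agency
    return None


for url in [
    "https://media.gettyimages.com/id/123/photo.jpg",
    "https://st2.depositphotos.com/1/2/i/450/photo.jpg",
    "https://example.com/uploads/team.jpg",
]:
    print(url, "->", match_stock_cdn(url))
```

Only the hostname is compared, so an image on your own server never triggers this check, whatever its filename.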

The scanner also checks for URL patterns that suggest watermarked or preview images, such as URLs containing words like "watermark," "preview," "comp," "sample" or "draft." These indicate someone downloaded or linked a preview version instead of purchasing a license.
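As a rough sketch of the keyword check, the snippet below splits the URL path and query string into words and looks for the five keywords quoted above. Matching whole words rather than substrings is an assumption made here so that "company" does not trigger on "comp"; the function name is hypothetical.

```python
import re
from urllib.parse import urlparse

# Keywords quoted in the paragraph above.
PREVIEW_KEYWORDS = {"watermark", "preview", "comp", "sample", "draft"}


def preview_keywords_in_url(image_url):
    """Return the sorted preview-style keywords found in the URL path or query."""
    parsed = urlparse(image_url.lower())
    # Split on anything that is not a letter or digit: "/", "-", "_", "." ...
    words = re.split(r"[^a-z0-9]+", parsed.path + "?" + parsed.query)
    return sorted(PREVIEW_KEYWORDS.intersection(words))


print(preview_keywords_in_url("https://example.com/img/hero-comp-watermark.jpg"))
print(preview_keywords_in_url("https://example.com/img/company-team.jpg"))
```

The trade-off of whole-word matching is that variants such as "watermarked" are missed. A real check would need to decide which way to err.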

Layer 2: EXIF and IPTC metadata analysis

Digital images contain embedded metadata. Professional stock photos typically include copyright information in their EXIF and IPTC data fields: the photographer's name, the agency, licensing terms and sometimes the specific image ID.

The scanner downloads up to 20 candidate images from your page and reads their metadata. It looks for stock agency names in the Copyright, Credit, Source and ImageDescription fields. A match against Getty, Shutterstock, Adobe Stock, Alamy, Dreamstime or iStock in any of these fields means the image originated from a commercial source.

Even without a specific agency match, the scanner flags images that contain generic copyright notices like "All Rights Reserved" or a copyright symbol. These aren't necessarily stock photos (they could be images from any photographer), but they indicate someone asserted ownership, and you should verify your right to use them.
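The two metadata checks can be sketched as follows. The sketch assumes the metadata has already been read from the file into a plain dictionary, so it shows only the matching logic. The field and agency names come from the text above. The function name, the return shape and the choice to look for generic notices in the same four fields are assumptions.

```python
import re

STOCK_AGENCIES = ["Getty", "Shutterstock", "Adobe Stock", "Alamy", "Dreamstime", "iStock"]
CHECKED_FIELDS = ["Copyright", "Credit", "Source", "ImageDescription"]
GENERIC_NOTICE = re.compile(r"all rights reserved|©", re.IGNORECASE)


def classify_metadata(metadata):
    """Classify already-extracted metadata (a dict of field name -> text).

    Returns (risk, agency): ("high", name) for an agency match,
    ("medium", None) for a generic notice, (None, None) otherwise.
    """
    values = [str(metadata.get(field, "")) for field in CHECKED_FIELDS]
    for value in values:
        for agency in STOCK_AGENCIES:
            if agency.lower() in value.lower():
                return ("high", agency)
    if any(GENERIC_NOTICE.search(value) for value in values):
        return ("medium", None)
    return (None, None)


print(classify_metadata({"Credit": "Getty Images/iStockphoto"}))
print(classify_metadata({"Copyright": "© 2024 Jane Doe. All Rights Reserved."}))
print(classify_metadata({}))
```

An empty dictionary, which is what a stripped image produces, yields no flag at all.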

There's a limitation here. Many websites strip EXIF data during upload, either deliberately or as a side effect of image optimization. WordPress, Squarespace and most modern CMS platforms remove metadata by default. If the metadata has been stripped, this check can't find anything. That doesn't mean the image is safe. It just means this particular detection method has nothing to work with.

Layer 3: Visual web detection (premium scans)

For premium scans, the scanner uses Google Cloud Vision's web detection API to check where else an image appears on the internet. This is the closest thing to what enforcement agencies do with their own fingerprinting systems.

The scanner uploads your images to the Vision API, which returns a list of websites where visually matching images appear. If your image shows up on gettyimages.com, shutterstock.com, or any other stock agency domain, the scanner flags it as high risk. If the image appears on 20 or more websites without a stock agency match, it gets a medium risk flag: it's widely distributed, which often means it's either a popular stock photo or a viral image with unclear licensing.

Images from safe sources like Unsplash, Pexels and Pixabay are excluded from flagging, since these platforms offer free commercial licenses.
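The decision rules for this layer can be sketched without calling any API. The snippet assumes the matching page URLs have already been pulled out of the web detection response. Only gettyimages.com and shutterstock.com are named in the text; the other stock domains are borrowed from the Layer 1 list. The order of the checks (stock match first, then safe source, then the 20-site threshold) is also an assumption.

```python
from urllib.parse import urlparse

# Assumption: domains beyond the two named above are taken from the Layer 1 list.
STOCK_DOMAINS = {
    "gettyimages.com", "shutterstock.com", "istockphoto.com", "stock.adobe.com",
    "dreamstime.com", "depositphotos.com", "123rf.com", "alamy.com",
}
SAFE_DOMAINS = {"unsplash.com", "pexels.com", "pixabay.com"}
WIDE_DISTRIBUTION_THRESHOLD = 20


def _host(url):
    """Lowercased hostname without a leading 'www.'; '' if the URL has none."""
    host = (urlparse(url).hostname or "").lower()
    return host[4:] if host.startswith("www.") else host


def _on_domain(host, domains):
    """True if host is one of the domains or a subdomain of one."""
    return any(host == d or host.endswith("." + d) for d in domains)


def classify_web_matches(page_urls):
    """Return 'high', 'medium' or None for a list of matching page URLs."""
    hosts = {_host(url) for url in page_urls}
    hosts.discard("")
    if any(_on_domain(h, STOCK_DOMAINS) for h in hosts):
        return "high"
    if any(_on_domain(h, SAFE_DOMAINS) for h in hosts):
        return None  # appears on a free-license platform: not flagged
    if len(hosts) >= WIDE_DISTRIBUTION_THRESHOLD:
        return "medium"
    return None


blogs = [f"https://blog{i}.example.org/post" for i in range(20)]
print(classify_web_matches(["https://www.gettyimages.com/detail/photo/x"]))
print(classify_web_matches(blogs))
print(classify_web_matches(blogs + ["https://unsplash.com/photos/abc"]))
```

Checking stock domains before safe sources means an image found on both is still flagged, which errs on the side of over-reporting.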

What the results mean

The scanner assigns risk levels based on what it finds.

Critical: Stock CDN URL match. Your website is loading images directly from a stock photo agency's servers. This is the strongest possible indicator. Action needed immediately.

High: Stock agency metadata or Vision API match. The image either contains embedded metadata from a stock agency or appears on stock agency websites when searched. Strong indicator that a license may be required.

Medium: Generic copyright metadata or widespread distribution. The image has copyright notices in its metadata or appears on many websites. This could be a stock photo, but it could also be a Creative Commons image with proper attribution requirements, or a photographer's original work.

No flags: Nothing detected. The scanner found no indicators. This does not mean the image is safe. It means none of the automated checks triggered. An image can be fully copyrighted with no metadata, hosted on your own server, and never indexed by Vision API. No scanner can guarantee an image is free to use.
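Put together, the four levels amount to a simple priority order. The sketch below is one way to express it, assuming each check reports a yes/no finding. The class and field names are hypothetical.

```python
from dataclasses import dataclass
from typing import Optional


@dataclass
class ImageFindings:
    """Hypothetical container for what the three layers reported about one image."""
    stock_cdn_match: bool = False
    agency_metadata: bool = False
    agency_web_match: bool = False
    generic_copyright_notice: bool = False
    widely_distributed: bool = False


def risk_level(findings: ImageFindings) -> Optional[str]:
    """Map findings to the highest applicable risk level, or None for no flags."""
    if findings.stock_cdn_match:
        return "critical"
    if findings.agency_metadata or findings.agency_web_match:
        return "high"
    if findings.generic_copyright_notice or findings.widely_distributed:
        return "medium"
    return None  # no flags: nothing detected, which is not the same as safe


print(risk_level(ImageFindings(stock_cdn_match=True, widely_distributed=True)))
print(risk_level(ImageFindings(agency_metadata=True)))
print(risk_level(ImageFindings()))
```

When several checks fire for the same image, the most serious one decides the level.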

False positives happen

The scanner is designed to over-report rather than under-report. A false positive costs you five minutes of checking your license records. A missed detection can cost you thousands of euros.

Common false positive scenarios:

You have a valid license. You purchased the image from Getty or Shutterstock, and the CDN or metadata match is from the legitimate source. The scanner can't know you have a license. It only detects the commercial agency origin.

Your designer bought the license. The image was properly purchased under your designer's account. The scanner flags the agency origin. You need to verify the license covers your specific use.

The image is from a free tier. Some stock agencies offer free images with restrictions. The scanner still flags the agency origin because automated checks can't distinguish between free and paid tiers.

In all cases, check your records, confirm the license exists, and move on.

What to do when the scanner flags an image

Step 1: Don't panic. A flag is not a lawsuit. It's an early warning.

Step 2: Identify the image. The scan report shows the URL and, for premium scans, the specific stock agency match. Find the image on your website.

Step 3: Check your license records. Search your stock photo account purchase history, check with your web designer, and look through any documentation from when the website was built.

Step 4: If you have a license, keep the proof. Document it. Screenshot your purchase receipt. Save the license agreement. If an enforcement agency contacts you later, you'll want this ready.

Step 5: If you don't have a license, replace the image. Remove it from your website and replace it with a properly licensed alternative. Our guide on safe free image sources lists services where you can find commercial-use images at no cost.

Step 6: Check the rest of your site. If one unlicensed image turned up, there are probably more. Scan your full website for copyrighted images rather than fixing them one at a time.

What the scanner can't do

Transparency about limitations matters more than false confidence.

It can't confirm you have a license. The scanner detects copyright indicators on the image side. It has no access to your purchase history or license agreements. Only you can verify the licensing status.

It can't check every image format. The scanner processes standard web images (JPEG, PNG, WebP). Images embedded in PDFs, videos or Flash content are not checked.

It won't catch every copyrighted image. An image hosted on your own server, with EXIF data stripped, that doesn't appear in Vision API's index, will pass through undetected. No tool can examine an arbitrary image and determine its copyright status with certainty.

It can't assess fair use or exceptions. Some uses of copyrighted images may be legal under specific exceptions. The scanner can't evaluate context-dependent legal questions.

What the scanner does offer is a systematic first pass that catches the most common and highest-risk scenarios. Stock CDN patterns and EXIF metadata account for the majority of cases where small businesses get hit with copyright claims. Finding those before Getty or CopyTrack does is the point.

Run your scan

If you haven't checked your website's images yet, now is a good time. Enforcement agencies don't send warnings before demand letters. The first you hear about it is when you owe money.

Run a free scan and see what your website looks like through the eyes of a copyright enforcement crawler.
