Apple was most definitely going for semantics:
"The embedding network represents images as real-valued vectors and ensures that perceptually and
semantically similar images have close descriptors in the sense of angular distance or cosine similarity.
Perceptually and semantically different images have descriptors farther apart, which results in larger
angular distances."
The difference is not that clear-cut. The system extracts features from the image, and from those features a neural network produces an image descriptor. If this descriptor is sufficiently similar to the descriptor of a known CSAM image, the image is flagged. Now yes, I understand that this type of system relies on existing images and is not capable of finding entirely new CSAM. But NCMEC was to provide its five million image hashes; that is a lot of images for a single subject matter, and if you then do similarity matching rather than exact matching, you have for all intents and purposes a CSAM classifier.
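For concreteness, here is a minimal sketch of what descriptor matching against a database of known images looks like. The function names, threshold, and overall shape are my own illustration, not Apple's published code:

    # A minimal sketch of descriptor-based similarity matching, not Apple's
    # actual pipeline. The threshold value is made up for illustration.
    import numpy as np

    def cosine_similarity(a: np.ndarray, b: np.ndarray) -> float:
        # Cosine of the angle between two descriptors: 1.0 = same direction.
        return float(np.dot(a, b) / (np.linalg.norm(a) * np.linalg.norm(b)))

    def is_flagged(descriptor, known_descriptors, threshold=0.95):
        # Flag if the descriptor is angularly close to ANY known descriptor.
        return any(cosine_similarity(descriptor, known) >= threshold
                   for known in known_descriptors)

With exact matching the threshold would effectively be 1.0; lowering it to catch transformed copies of five million images is what makes the whole thing behave like a classifier for the subject matter.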
Training matters.
Semantic classifiers are given many examples of cats, trained to semantically identify cats in general and to distinguish them from other things, and then asked to infer whether a new image is a cat. Typically such a classifier is trained on multiple classes (cat, dog, house, car) and at inference returns a confidence for each class, as in the sketch below. This is probably how "safe search" image filters are implemented: trained to find stuff that looks like a general definition of porn.
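As a toy version of that per-class confidence output (made-up class list and logits, no real model behind this):

    # Illustrative only: a classifier head emits one logit per class and
    # softmax turns the logits into confidences that sum to 1.
    import numpy as np

    classes = ["cat", "dog", "house", "car"]

    def softmax(logits):
        exp = np.exp(logits - logits.max())  # subtract max for stability
        return exp / exp.sum()

    logits = np.array([4.1, 1.3, 0.2, -0.5])  # made-up outputs for one image
    for name, conf in zip(classes, softmax(logits)):
        print(f"{name}: {conf:.2f}")  # roughly cat: 0.92, dog: 0.06, ...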
That is not how Apple's NeuralHash is trained. It is trained to detect a specific image under perceptually invariant transforms, not a class of related but distinct images. It is not detecting CSAM; it is detecting specific instances of CSAM. It is trained to distinguish those instances from examples that are not them, to keep it from, to take an oversimplified example, declaring that any image with a lot of flesh tones is CSAM.
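To make the contrast concrete, one standard way to express that kind of instance-level objective is a triplet loss where the positive is the same image under a perturbation and the negative is a different image. This is my reading of the setup, not Apple's published training code:

    # Hedged sketch: the positive pair is the SAME image transformed
    # (crop, resize, recompression); the negative is an unrelated image.
    import numpy as np

    def triplet_loss(anchor, positive, negative, margin=0.2):
        # Standard triplet margin loss on L2-normalized descriptors.
        anchor, positive, negative = (v / np.linalg.norm(v)
                                      for v in (anchor, positive, negative))
        d_pos = np.linalg.norm(anchor - positive)  # same image, transformed
        d_neg = np.linalg.norm(anchor - negative)  # unrelated image
        return max(0.0, d_pos - d_neg + margin)

A cat classifier would instead use other cat photos as positives; here the only positives are transformed copies of one specific image.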
They are not perfectly reversible, that is true. But you can recreate a recognizable approximation of the original image. See:
https://towardsdatascience.com/black-box-attacks-on-perceptual-image-hashes-with-gans-cc1be11f277
This has been shown to work with PhotoDNA, a perceptual image hashing system widely used for CSAM detection. Maybe Apple's system would have been immune to this, but they spent quite some effort to prevent people from even trying, so I have my doubts.
No, you can't.
Here's an input image from that link:
Here's the image "reversed" from the Domino's hash:
That doesn't look like anything I'd be worried about.
A few things: first, your link doesn't use PhotoDNA; it uses the Python imagehash library, which was not designed to resist the kind of attack you're discussing.
Second, it's using a GAN to create another image that matches the hash, but not necessarily the image that created the hash. Specifically, it's saying "create an image that is a face and matches this hash". Since the first examples are all faces filling the frame, and the GAN is creating a face filling the frame, it's easy to freak out and say "it's reversible!" It's also easy to say "they even got the hair color right!", because the hash being used is aHash, which is based on the average brightness of image regions.
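For reference, this is essentially all aHash computes (a paraphrase of the imagehash library's algorithm, not a copy of its code):

    # aHash: shrink to 8x8 grayscale and record, per cell, whether it is
    # brighter than the mean. 64 bits of coarse luminance layout, no detail.
    from PIL import Image
    import numpy as np

    def average_hash(path, hash_size=8):
        img = Image.open(path).convert("L").resize((hash_size, hash_size))
        pixels = np.asarray(img, dtype=np.float64)
        return pixels > pixels.mean()  # 8x8 boolean grid = the 64-bit hash

Any image whose coarse brightness layout matches will collide, which is why a GAN constrained to "face + this hash" can also land near the right hair brightness: that information lives directly in the hash bits.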
The example I shared above is what happens when the image is arbitrary, as it would be in your photo library, and you ask the GAN to "create an image that is a face and matches the hash". What you get is nothing like the input. You could do the same with a GAN trained to make pictures of cars that match a hash, or landscapes that match a hash; all would look like something, but nothing like what the original image was. Then imagine the image was 8MP rather than 0.044MP and the variation that would lead to. This is an example of what I meant by "spoofing": generating an image that matches the hash, but looks nothing like the image that created the hash.
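To illustrate with the simplest possible hash, here is a sketch that deliberately constructs an aHash collision without any GAN at all. File names are placeholders, and this obviously does not transfer as-is to PhotoDNA or NeuralHash:

    # Spoofing in this sense: build an image that collides with a target
    # aHash while looking nothing like the original (it's just gray blocks).
    from PIL import Image
    import imagehash
    import numpy as np

    target = imagehash.average_hash(Image.open("original.png"))  # 8x8 bits

    # Paint bright blocks where the hash bit is 1 and dark where it is 0,
    # so the decoy's 8x8 brightness layout reproduces the target pattern.
    bits = np.asarray(target.hash)                     # 8x8 boolean array
    canvas = np.where(bits, 200, 60).astype(np.uint8)  # arbitrary gray levels
    decoy = Image.fromarray(canvas, mode="L").resize((256, 256), Image.NEAREST)

    print(imagehash.average_hash(decoy) - target)  # Hamming distance: ~0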