Gravity Spy Machine Learning by Machine Learning
Machine learning offers powerful tools for performing complex tasks – its applications are broad in scope, and virtually limitless given sufficient resources. Within Gravity Spy, armed with the ever-growing dataset of detector glitches classified by our volunteers, we are able to construct an algorithm that can not only identify glitches in detector data, but distinguish between different glitch types with impressive accuracy. ChatGPT is another application of machine learning that has been hitting the headlines recently. It is notable in being able to write detailed text in response to prompts. To illustrate its potential, I asked it to summarise the inner workings of the Gravity Spy machine learning classifier. Its response is below, unedited (except for some visual aids I have provided)…
At the heart of the Gravity Spy project is a machine learning algorithm called a convolutional neural network (CNN). CNNs are a type of deep learning algorithm that are particularly effective at image recognition tasks. CNNs are particularly well-suited to analysing complex data such as the spectrograms produced by gravitational wave detectors. These spectrograms represent the frequency and intensity of the detector output over time, and can be thought of as a two-dimensional image.
The Gravity Spy CNN consists of multiple layers, each designed to extract increasingly complex features from these input spectrograms. The first layers are convolutional layers, which apply a set of learned filters to the input image. These filters are designed to highlight specific features in the image, such as lines, edges, or curves. The convolution operation involves sliding the filters over the input data, multiplying the values at each location, and summing the results.
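To make the "slide, multiply, sum" description concrete, here is a minimal sketch in plain Python. It is not the Gravity Spy code: the function name `convolve2d`, the toy 4x4 "spectrogram" and the hand-written edge filter are all assumptions chosen for illustration. In a real CNN the filter values are learned during training, and the operation is usually implemented without flipping the filter (strictly a cross-correlation), which is what is shown here.

```python
def convolve2d(image, kernel):
    """Slide `kernel` over `image` (stride 1, no padding), multiplying
    overlapping values and summing them at each position."""
    kh, kw = len(kernel), len(kernel[0])
    out_h = len(image) - kh + 1
    out_w = len(image[0]) - kw + 1
    output = []
    for i in range(out_h):
        row = []
        for j in range(out_w):
            total = 0.0
            for di in range(kh):
                for dj in range(kw):
                    total += image[i + di][j + dj] * kernel[di][dj]
            row.append(total)
        output.append(row)
    return output


# Toy "spectrogram" (hypothetical values): quiet on the left, loud on the right.
image = [
    [0, 0, 1, 1],
    [0, 0, 1, 1],
    [0, 0, 1, 1],
    [0, 0, 1, 1],
]

# Hand-written filter that responds to a dark-to-bright vertical edge.
# In a trained network these numbers would be learned, not chosen by hand.
kernel = [
    [-1, 1],
    [-1, 1],
]

feature_map = convolve2d(image, kernel)
for row in feature_map:
    print(row)
```

The resulting feature map is largest in the middle column, which is where the filter sits across the edge between the quiet and loud regions.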


The output of the convolutional layers is passed through max-pooling layers, which downsample the output by selecting the maximum value in each neighbourhood. This reduces the dimensionality of the output, making the model more efficient while preserving the most important features. This process is repeated with multiple filters to generate a set of feature maps for each layer of the network. Finally, the output from the last convolutional and pooling layers is flattened and fed into one or more fully connected layers, which produce the output glitch classification. These layers use weights to connect every neuron in one layer to every neuron in the next layer, allowing the network to learn complex relationships between the extracted features and the corresponding glitch that they were derived from.
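The pooling, flattening and fully connected steps can be sketched in the same spirit. Everything below is illustrative rather than taken from the Gravity Spy network: the helper names, the 2x2 pooling window, the made-up weights and the two output classes are assumptions. The final softmax step, which turns the raw scores into probabilities, is also an assumption; the text above only says that the fully connected layers produce the classification.

```python
import math


def max_pool2d(feature_map, size=2):
    """Downsample by keeping the maximum of each non-overlapping size x size block."""
    pooled = []
    for i in range(0, len(feature_map) - size + 1, size):
        row = []
        for j in range(0, len(feature_map[0]) - size + 1, size):
            block = [feature_map[i + di][j + dj]
                     for di in range(size) for dj in range(size)]
            row.append(max(block))
        pooled.append(row)
    return pooled


def flatten(feature_map):
    """Turn a 2D feature map into a flat list of features."""
    return [value for row in feature_map for value in row]


def dense(inputs, weights, biases):
    """Fully connected layer: every input contributes to every output."""
    return [sum(w * x for w, x in zip(w_row, inputs)) + b
            for w_row, b in zip(weights, biases)]


def softmax(scores):
    """Convert raw scores into probabilities that sum to 1."""
    shift = max(scores)  # subtracting the max avoids overflow in exp
    exps = [math.exp(s - shift) for s in scores]
    total = sum(exps)
    return [e / total for e in exps]


# Hypothetical 4x4 feature map from an earlier convolutional layer.
feature_map = [
    [1, 3, 2, 0],
    [4, 2, 1, 1],
    [0, 1, 5, 2],
    [2, 0, 3, 6],
]

pooled = max_pool2d(feature_map)   # 4x4 -> 2x2
features = flatten(pooled)         # 2x2 -> 4 values

# Made-up weights for two hypothetical output classes; a real network learns these.
weights = [
    [0.1, 0.0, 0.0, 0.2],
    [0.0, 0.3, 0.1, 0.0],
]
biases = [0.0, 0.0]

scores = dense(features, weights, biases)
probabilities = softmax(scores)
print(pooled, features, probabilities)
```

Pooling shrinks the 16 values to 4 while keeping the strongest response in each neighbourhood, and the fully connected layer then weighs those 4 features to score each class.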

To train the Gravity Spy CNN, a large dataset of labelled examples is required. Citizen scientists contribute to this effort by classifying glitches on Zooniverse. The CNN is trained using an optimization algorithm to minimize a loss function, with the cross-entropy loss function being used in the case of Gravity Spy. Cross-entropy is a measure of the difference between the predicted probabilities and the true labels for each example in the training set. By minimizing the cross-entropy loss, the CNN is able to learn to make increasingly accurate predictions about which glitch class each spectrogram belongs to. This minimisation is achieved by finely tuning the parameters of each layer of the CNN in an iterative fashion.
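As a rough illustration of cross-entropy and of iterative minimisation, the sketch below computes the loss for a single example and then takes a few gradient descent steps. It is a deliberate simplification and not the Gravity Spy training code: the three classes, the learning rate and the starting scores are assumptions, and the steps adjust the raw class scores directly, whereas real training adjusts the filters and weights inside each layer. For one example whose true class is c, the loss is -log(p_c), where p_c is the predicted probability of that class.

```python
import math


def softmax(scores):
    """Convert raw scores into probabilities that sum to 1."""
    shift = max(scores)
    exps = [math.exp(s - shift) for s in scores]
    total = sum(exps)
    return [e / total for e in exps]


def cross_entropy(probabilities, true_index):
    """Loss for one example: small when the true class gets high probability."""
    return -math.log(probabilities[true_index])


# A confident correct prediction is penalised far less than a confident wrong one.
loss_confident_right = cross_entropy([0.90, 0.05, 0.05], true_index=0)
loss_unsure = cross_entropy([0.40, 0.30, 0.30], true_index=0)
loss_confident_wrong = cross_entropy([0.05, 0.90, 0.05], true_index=0)

# Toy minimisation: start with no preference between three hypothetical classes
# and repeatedly nudge the scores downhill on the loss.
scores = [0.0, 0.0, 0.0]
true_index = 2
learning_rate = 1.0  # assumed value, chosen only for this illustration
losses = []
for step in range(5):
    probabilities = softmax(scores)
    losses.append(cross_entropy(probabilities, true_index))
    # Gradient of the loss with respect to each score: p_k minus 1 for the true class.
    gradient = [p - (1.0 if k == true_index else 0.0)
                for k, p in enumerate(probabilities)]
    scores = [s - learning_rate * g for s, g in zip(scores, gradient)]

print(loss_confident_right, loss_unsure, loss_confident_wrong)
print(losses)
```

Each step lowers the loss, mirroring in miniature how the network's predictions improve as its parameters are tuned over many iterations.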
One of the key contributions of the Gravity Spy project has been the identification of new classes of glitch. These include the Crown (also known as Fast Scattering), Pizzicato, Paired Doves, and Helix classes. Paired Dove and Helix classes were identified during early beta testing of Gravity Spy on Zooniverse, while Crown was particularly common during the third observing run of LIGO. Pizzicato, which looks like a UFO in spectrograms, was recently identified, and is thought to be related to violin modes in the LIGO detector’s suspension system. The Gravity Spy CNN can easily incorporate these new glitch classes, and by properly sub-dividing existing glitch classes into new ones their origins in the detectors can be better identified.
In conclusion, the Gravity Spy project is an excellent example of the power of citizen science and machine learning in advancing our understanding of the universe. By harnessing the collective intelligence of volunteers on Zooniverse and using a CNN trained on a large dataset of labelled examples, the project has made it possible to distinguish true gravitational wave signals from noise and identify new classes of glitch. As machine learning technology continues to improve, we can expect to see similar projects emerging in other areas of science, enabling us to tackle ever more complex and challenging questions.
Impressive (and perhaps a little unnerving)! ChatGPT knew a lot about Gravity Spy and convolutional neural networks by default. It did need to be told about the new glitch classes, and the specifics of the layers used in our network, but once told, it was able to write about them, as you can see! I didn’t tell it to say that Gravity Spy is excellent.
The key to ChatGPT’s remarkable abilities is the vast high-quality training dataset provided to it. Without this dataset, it would not be possible to produce such powerful generative algorithms. In a similar vein, the Gravity Spy classifier’s incredible capability to classify glitches in gravitational wave detector data is only possible because of the high-quality dataset of labelled glitch classifications it is provided with: the work of our volunteers!
- Christian Chapman-Bird, on behalf of the Gravity Spy team.