During a Twitter discussion Noel O'Boyle introduced me to Graph Edit Distance (GED) as a useful measure of molecular similarity. The advantages over other approaches such as Tanimoto similarity are discussed in these slides by Roger Sayle.
It turns out NetworkX can compute this, so it's relatively easy to interface with RDKit, and the implementation is shown below.
Unfortunately, the time required for computing GED increases exponentially with molecule size, so this implementation is not really of practical use.
Sayle's slides discuss one solution to this, but it's far from trivial to implement. If you know of other open source implementations, please let me know.
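To give a feel for how the RDKit/NetworkX interface can work, here is a minimal sketch. It is not necessarily the code on the GitHub page: the helper names `mol_to_graph` and `smiles_ged`, the `element` and `order` attribute names, and the choice to match atoms by element and bonds by bond order are all my assumptions for illustration. The only library calls used are `Chem.MolFromSmiles`, the standard RDKit atom and bond accessors, and `networkx.graph_edit_distance`.

```python
import networkx as nx
from rdkit import Chem


def mol_to_graph(mol):
    """Convert an RDKit molecule into a NetworkX graph.

    Nodes are atoms labelled with their element symbol and edges are
    bonds labelled with their bond order (both labels are assumptions
    made for this sketch).
    """
    graph = nx.Graph()
    for atom in mol.GetAtoms():
        graph.add_node(atom.GetIdx(), element=atom.GetSymbol())
    for bond in mol.GetBonds():
        graph.add_edge(
            bond.GetBeginAtomIdx(),
            bond.GetEndAtomIdx(),
            order=bond.GetBondTypeAsDouble(),
        )
    return graph


def smiles_ged(smiles_a, smiles_b, timeout=None):
    """Graph edit distance between two molecules given as SMILES.

    `timeout` (seconds) is passed straight to NetworkX; when it is hit,
    NetworkX returns the best distance found so far rather than the
    guaranteed minimum.
    """
    mols = [Chem.MolFromSmiles(s) for s in (smiles_a, smiles_b)]
    if any(m is None for m in mols):
        raise ValueError("could not parse one of the SMILES strings")
    graph_a, graph_b = (mol_to_graph(m) for m in mols)
    return nx.graph_edit_distance(
        graph_a,
        graph_b,
        # Atoms only count as equal if they are the same element.
        node_match=lambda a, b: a["element"] == b["element"],
        # Bonds only count as equal if they have the same bond order.
        edge_match=lambda a, b: a["order"] == b["order"],
        timeout=timeout,
    )


if __name__ == "__main__":
    # Ethanol vs. propane: the graphs differ by a single atom substitution.
    print(smiles_ged("CCO", "CCC"))
```

With the default unit costs, ethanol and propane differ by one atom substitution, so the distance is 1. For anything much larger than a handful of heavy atoms it is worth setting `timeout`, since the exact search is what makes the run time blow up.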
Update: GitHub page
This work is licensed under a Creative Commons Attribution 4.0