TriGraph: A Probabilistic Subgraph-Based Model for Visual Code Completion in Pure Data

Anisha Islam and Abram Hindle

2025/01/12

Venue

Proceedings of the 22nd International Conference on Mining Software Repositories, Ottawa, Canada

Abstract

Pure Data (PD) is a visual programming language for computer music that allows users to create applications through a graph-based, drag-and-drop interface, using objects and connections to manage program flow. There is a lack of tool support for computer musicians using PD, particularly for code completion. In this paper, we introduce TriGraph, a graph-based probabilistic model specifically designed for code completion in PD. TriGraph uses statistical analysis of 2-node and 3-node subgraph frequencies to predict nodes and connections in PD graphs. Using a dataset of parsed PD files, we train and evaluate 5 TriGraph models, assessing their performance in predicting nodes and edges in PD graphs. Our evaluations indicate that the models achieve an average Mean Reciprocal Rank (MRR) score of 0.39 for node prediction, placing the correct answer within the top 3 suggestions, and outperforming the n-gram based KenLM model on similar tasks. For edge prediction, the models achieve an average MRR score of 0.57, with results showing that incorporating both 2-node and 3-node subgraphs yields better results than using only 3-node subgraphs. These findings suggest that TriGraph could enhance the productivity of PD programmers by providing code completion support that may speed up development, reduce errors, and assist in discovering available options. These potential benefits highlight its promise as a valuable support tool for end-user programmers in graphical environments.
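
To make the abstract's two central ideas concrete, the following is a minimal Python sketch of frequency-based prediction over 2-node subgraphs (single connections) and of how a Mean Reciprocal Rank score is computed. It is an illustration under stated assumptions, not the authors' implementation: the toy graphs, the object names, and the helper functions `count_two_node_subgraphs`, `rank_candidates`, and `mean_reciprocal_rank` are all hypothetical, and the 3-node subgraphs that TriGraph also uses are omitted for brevity.

```python
from collections import Counter, defaultdict

# Hypothetical toy "training set": each graph is a list of directed
# (source_object, target_object) connections. The object names are
# illustrative and are not taken from the paper's dataset.
TRAIN_GRAPHS = [
    [("osc~", "*~"), ("*~", "dac~")],
    [("osc~", "*~"), ("*~", "dac~"), ("osc~", "dac~")],
    [("phasor~", "*~"), ("*~", "dac~")],
]


def count_two_node_subgraphs(graphs):
    """Count how often each target object follows each source object."""
    counts = defaultdict(Counter)
    for edges in graphs:
        for source, target in edges:
            counts[source][target] += 1
    return counts


def rank_candidates(counts, source):
    """Return candidate targets for `source`, most frequent first."""
    # Sort by descending count, then by name so that ties are deterministic.
    items = sorted(counts[source].items(), key=lambda kv: (-kv[1], kv[0]))
    return [target for target, _ in items]


def mean_reciprocal_rank(counts, queries):
    """Average 1/rank of the true target; a missing target contributes 0."""
    total = 0.0
    for source, truth in queries:
        ranked = rank_candidates(counts, source)
        if truth in ranked:
            total += 1.0 / (ranked.index(truth) + 1)
    return total / len(queries)


counts = count_two_node_subgraphs(TRAIN_GRAPHS)
ranking = rank_candidates(counts, "osc~")

# Three hypothetical held-out queries: ranks 1, 2, and "not suggested".
queries = [("osc~", "*~"), ("osc~", "dac~"), ("phasor~", "line~")]
mrr = mean_reciprocal_rank(counts, queries)
print(ranking, mrr)
```

In this toy setup the three queries contribute reciprocal ranks of 1, 1/2, and 0, giving an MRR of 0.5. By the same arithmetic, the reported node-prediction MRR of 0.39 has a reciprocal of roughly 2.6, which is consistent with the abstract's reading that the correct answer tends to fall within the top 3 suggestions.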

Bibtex

@inproceedings{islam2025MSR-trigraph,
 abstract = {Pure Data (PD) is a visual programming language for computer music that allows users to create applications through a graph-based, drag-and-drop interface, using objects and connections to manage program flow. There is a lack of tool support for computer musicians using PD, particularly for code completion. In this paper, we introduce TriGraph, a graph-based probabilistic model specifically designed for code completion in PD. TriGraph uses statistical analysis of 2-node and 3-node subgraph frequencies to predict nodes and connections in PD graphs. Using a dataset of parsed PD files, we train and evaluate 5 TriGraph models, assessing their performance in predicting nodes and edges in PD graphs. Our evaluations indicate that the models achieve an average Mean Reciprocal Rank (MRR) score of 0.39 for node prediction, placing the correct answer within the top 3 suggestions, and outperforming the n-gram based KenLM model on similar tasks. For edge prediction, the models achieve an average MRR score of 0.57, with results showing that incorporating both 2-node and 3-node subgraphs yields better results than using only 3-node subgraphs. These findings suggest that TriGraph could enhance the productivity of PD programmers by providing code completion support that may speed up development, reduce errors, and assist in discovering available options. These potential benefits highlight its promise as a valuable support tool for end-user programmers in graphical environments.},
 accepted = {2025-01-12},
 author = {Anisha Islam and Abram Hindle},
 authors = {Anisha Islam and Abram Hindle},
 booktitle = {Proceedings of the 22nd International Conference on Mining Software Repositories},
 code = {islam2025MSR-trigraph},
 date = {2025-04-29},
 doi = {},
 funding = {NSERC Discovery Grant},
 location = {Ottawa, Canada},
 pagerange = {},
 pages = {},
 rate = {29%},
 role = {Co-Author},
 title = {TriGraph: A Probabilistic Subgraph-Based Model for Visual Code Completion in Pure Data},
 type = {inproceedings},
 url = {http://softwareprocess.ca/pubs/islam2025MSR-trigraph.pdf},
 venue = {Proceedings of the 22nd International Conference on Mining Software Repositories},
 year = {2025}
}