Context: Algorithms to find short paths on Cayley graphs of (finite) groups are of some interest - see below.
There can be several approaches to this task. One idea that comes to mind: in some sense the problem can be seen as a "language translation" or "sequence to sequence" machine learning task (see details below), and such ML techniques are quite powerful (e.g. ChatGPT). My colleagues and I have started to think in that direction, so, not to "reinvent the wheel", here is a question:
Question: Is the approach (see details below) of finding short paths on Cayley graphs by "language translation" ("sequence to sequence" methods) discussed anywhere in the literature? (I have not seen it, but I might have missed something.) More generally, can one foresee potential problems? Can you suggest the best transformer (or other) architecture, training scheme, and other ML tricks for the neural network?
Brief idea: Consider for simplicity just the symmetric group $S_n$ for some $n$. Choose some generators $g_i$ (let me consider 3 generators "R", "G", "B") and construct the Cayley graph. Any sequence of generators (e.g. RRGGBBR[e]) applied to the identity gives some element of the group. On the other hand, any element of $S_n$ is by definition a permutation of $(1,2,3,...,n)$.
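To make this concrete, here is a minimal sketch of applying a generator word to the identity in $S_4$. The three generator permutations chosen for "R", "G", "B" below are hypothetical examples (adjacent transpositions), not a canonical choice:

```python
# Hypothetical example generators for S_4, in 0-based one-line notation.
GENERATORS = {
    "R": (1, 0, 2, 3),  # transposition of positions 0 and 1
    "G": (0, 2, 1, 3),  # transposition of positions 1 and 2
    "B": (0, 1, 3, 2),  # transposition of positions 2 and 3
}

def apply_word(word, n=4):
    """Apply a sequence of generators (a word in "Cayleylandski")
    to the identity, returning the resulting permutation
    (a word in "Permutlandski")."""
    state = tuple(range(n))  # identity permutation (0, 1, ..., n-1)
    for letter in word:
        g = GENERATORS[letter]
        # compose: the new position i holds whatever was at position g[i]
        state = tuple(state[g[i]] for i in range(n))
    return state
```

For example, `apply_word("RR")` returns the identity again, since each of these example generators is an involution. (The composition convention - acting on positions rather than values - is one of two possible choices; either works as long as it is used consistently.)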
Thus we have two "languages" to describe the elements of the group:
"Permutlandski" - an element is a sequence made of $(1,2,...,n)$ - just a permutation of $(1,2,3,...,n)$.
"Cayleylandski" - an element is a sequence of generators (e.g. "RGBRRGB", applied to the identity). (As with natural languages, one can describe the same element by different "words/sentences".)
Main point: So the task - find a short path from a given permutation $(i_1,i_2,...,i_n)$ to the identity permutation $(1,2,3,...,n)$ - is exactly the "translation" task from "Permutlandski" to "Cayleylandski".
Exactly as in the language translation problem, the sequences might be of different lengths (in "Cayleylandski" - from 0 to the diameter), and there are different ways to express the SAME element (sentence) in the "Cayleylandski" language. Although these are unpleasant issues for machine learning, that is exactly how it happens in "sequence-to-sequence" tasks, and standard methods are applicable.
Note: In the setup above, each choice of generators of a group generates a NEW "Cayleylandski" language, so we get a whole family of languages. Each new choice of generators requires a new model to be trained.
Approach to solution:
Random walks - to create the "train set". To apply a machine learning technique we should generate a "training set", e.g. pairs: sentence + its translation. We then train the model to correctly reproduce these examples, and hope that it will correctly generalize to unseen examples.
Here generating the training set is quite easy, because the "back translation" from "Cayleylandski" to "Permutlandski" is very straightforward - just multiply the permutations.
So we generate random sequences of generators (i.e. consider random walks on the group) and thus obtain the desired set of pairs: (sentence in "Cayleylandski", sentence in "Permutlandski").
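The random-walk generation of training pairs can be sketched as follows (a minimal illustration; the function name and parameters are my own, and the generators are assumed to be given as permutations in one-line notation):

```python
import random

def random_walk_pairs(generators, n, num_pairs, max_len=20, seed=0):
    """Generate (permutation, generator-word) training pairs by random
    walks on the group. The "back translation" from the word to the
    permutation is just composition of permutations."""
    rng = random.Random(seed)
    letters = list(generators)
    pairs = []
    for _ in range(num_pairs):
        length = rng.randint(1, max_len)
        word = "".join(rng.choice(letters) for _ in range(length))
        state = tuple(range(n))  # start the walk at the identity
        for letter in word:
            g = generators[letter]
            state = tuple(state[g[i]] for i in range(n))
        pairs.append((state, word))
    return pairs
```

One small caveat: the walk records a word from the identity TO the permutation, while the task asks for a path from the permutation BACK to the identity; the two are related by inverting the word (reversing it and inverting each generator - for involutive generators such as transpositions, simply reversing it).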
After that we can train a machine learning model to give good results on that training set. The most popular such models are "Transformers".
Example: We have tried a transformer model for a particular choice of generators of $S_n$ - the most standard simple ones, the adjacent transpositions $(i,i+1)$. See the code by Nikita Buhal here: https://www.kaggle.com/code/tttzof351/bubblesort-permutation . In that simple case it seems the model can work. Note: here the solution to the task is the standard "bubble sort" algorithm - so we are training the neural net to reinvent "bubble sort".
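For reference, here is what the known solution looks like in this case: bubble sort, read off as a word in the adjacent transpositions. Each swap of positions $(i, i+1)$ contributes one generator; the generator labels "s0", "s1", ... are hypothetical names for illustration. The word produced has length equal to the number of inversions of the permutation, which is the minimal word length for these generators:

```python
def bubble_sort_word(perm):
    """Sort perm by bubble sort and return the list of adjacent
    transpositions applied - i.e. a word in "Cayleylandski"
    translating perm to the identity."""
    p = list(perm)
    word = []
    for end in range(len(p) - 1, 0, -1):
        for i in range(end):
            if p[i] > p[i + 1]:
                p[i], p[i + 1] = p[i + 1], p[i]
                word.append(f"s{i}")  # generator swapping positions i, i+1
    return word
```

So the neural net's target output on this generator set is exactly the word this classical algorithm produces.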
Previous works: There are several approaches to using machine learning to find short paths on Cayley graphs - those I know of mainly concern the Rubik's cube groups and rely on reinforcement learning or more classical machine learning, though they can be extended to general groups. See e.g. the list of references here: https://www.kaggle.com/datasets/alexandervc/growth-in-finite-groups/discussion/485599, and some of my own (based on a different idea) here: https://www.kaggle.com/competitions/santa-2023/discussion/472594 .
Applications of the short path problem: sampling with such an algorithm allows one to estimate growth and hence the diameter (which is not more than 2*median distance - D. Zare, MO); in computational biology, Cayley distance corresponds to evolutionary distance; in parallel computing, latency corresponds to the diameter of the cluster connection graph; the sorting problem and solving puzzles such as Rubik's cube are exactly the same task; see also Mark Sapir's MO question; and so on.
On the machine learning side, this problem can be seen as a toy model for testing/benchmarking different neural net architectures. The problem is non-trivial, yet one can easily generate as much train/validation data as one wants. Generating large training sets for real data is always a headache, but here it is not.
PS
We are trying to organize something like a "Mini-Polymath" project at the intersection of group theory and machine learning, applying machine learning methods to Cayley graph tasks. Everybody is welcome to join. The team workspace is the Kaggle dataset: https://www.kaggle.com/datasets/alexandervc/growth-in-finite-groups. Team chat: https://t.me/sberlogasci . Some video recordings of the webinars can be found here: https://www.youtube.com/@SciBerloga/playlists . Today (18 April, 18:00 CET) we plan to discuss these ideas. The Zoom link will be available here https://t.me/sberlogabig 5 minutes before the start (or just contact me). Some slides from previous talks.