So if I'm understanding correctly, you're given a string (e.g. "Star Wars") and an on-screen keyboard accessible through an up-down-left-right controller. The goal is to be able to convert a given string into a sequence of UDLR commands that will enter that string through the keyboard.
What if you pre-calculated a map of the optimal path between any two keys? Store it in a hashmap where the key is "current_position+desired_position" (e.g. from "Star Wars", you would have entries for "S+t", "t+a", "a+r", etc.) and the value is a sequence of U+D+L+R indicating the cursor moves.
You could even get clever and store it as some kind of routing table, similar to what is used for internet routers. Think of each letter on the keyboard as a router that has a route to each of its neighbors in the UDLR directions. Then, inject a dictionary of strings into the network based on movie titles, actor names, etc. from each "endpoint" and let it learn the most efficient way to enter each of them. This way, when you go to internationalize or change keyboard layouts, you can re-create an efficient map automatically.