A group of robotics experts at Singapore’s Government Technology Agency (GovTech) is working on a robotics technology stack that aims to unify the fragmented landscape of tools used to develop robot software.
That landscape is dominated by the Robot Operating System (ROS) framework from the open source community, Nvidia’s Isaac platform, and customised software development kits (SDKs) from robot makers such as Boston Dynamics.
Speaking to Computer Weekly in an exclusive interview, Chong Jiayi, a distinguished engineer at GovTech, said ROS is most often used by academia and hobbyists to build robotic applications – although it may not be suited for industrial-grade robots that require low-latency performance.
Nvidia’s Isaac, on the other hand, builds on some of the best features of ROS, such as its directed node graph structure for orchestrating and controlling robot movements, while catering to production-quality robotic applications, said Chong.
GovTech’s robotics stack, written in the C++ programming language, was designed to be lightweight, platform neutral and compatible with existing SDKs. “We do not discriminate, whether you use ROS, Isaac or a custom SDK,” said Chong.
For example, he said the robotic stack’s beyond-visual-line-of-sight (BVLOS) capability, which enables GovTech’s four-legged Boston Dynamics robot, called Spot, to be piloted remotely, plugs directly into Boston Dynamics’ SDK.
Spot was deployed in May 2020 with much public interest to help with safe distancing efforts at parks, gardens and nature reserves in Singapore, starting with a two-week trial at Bishan-Ang Mo Kio Park. Fitted with cameras, the robot broadcast a recorded message reminding park visitors to observe safe distancing measures.
Chong’s team has since taken, and heavily modified, the MIT-licensed open source code from ROS to build autonomous capabilities for Spot. “We’re doing a lot of deep learning, so the next phase of autonomy is to rely less on the old-school Slam-based approach and more on visual and perception-based autonomy,” said Chong.
Slam is the acronym for simultaneous localisation and mapping, which helps robots navigate unfamiliar areas by reconstructing the 3D environment around them. GovTech, however, is working on what is known as Visual Slam, which uses commodity cameras fitted on robots – rather than expensive and bulky Lidar laser-sensing technology – to map the environment.
By using a generic robotics stack, Chong said his team will be able to tap Nvidia’s Isaac and other robotic software platforms with deep learning capabilities to develop Visual Slam, which is crucial for a robot such as Spot to operate autonomously in unstructured environments.
“The GovTech robotics stack is not centred around any single framework,” he said. “We have specifically designed it to be as generic as possible, so that in the future, if something else comes up, we can still support it.”
Meanwhile, Chong’s team is continuing to refine the robotics stack and has already achieved breakthroughs that enable Spot to be used in areas where mobile coverage is not good enough to support real-time, synchronous communications between the robot and its human controllers.
“There were some places where we couldn’t even get a signal, but the robot recovered gracefully,” said Chong. He added that a lot of engineering work went into quality of service, so that the robot is smart enough to send data packets – including video feeds from its five onboard cameras – to remote human controllers at the right time without overloading the network.