Page 30 - EE Times Europe Magazine – June 2024

AUTONOMOUS VEHICLES | ARTIFICIAL INTELLIGENCE

Overcoming Unbalanced Training Data for Safer Autonomous Driving

By Pat Brans

According to a Sweden-based expert on autonomous systems, self-driving cars won’t be available for at least another 10 years, and part of the holdup is data.

“Many of the remaining challenges standing in the way of fully autonomous vehicles have to do with the quality of data used to train the neural networks that control a vehicle,” said Michael Felsberg, full professor and head of the Computer Vision Laboratory at Sweden’s Linköping University. To ensure that AVs react appropriately to real-world road conditions and events, researchers are working on ways to fill the gaps in training data and correct for biases in the datasets, he told EE Times Europe.

Felsberg serves as a member of the Wallenberg AI, Autonomous Systems and Software Program (WASP) executive committee, representing Linköping University. He also collaborates with industrial players, including car and truck manufacturers and companies that produce support systems for vehicles.

Recording the data needed for training is expensive, and so is labeling it, largely because the task still requires human intervention. According to Felsberg, to bring down costs, the role of humans in the labeling process must be minimized through some form of weakly supervised learning whereby labels are assigned automatically or at least semi-automatically. But while many academic researchers and industrial players have been experimenting with weakly supervised learning, thus far, none of the methods are ready for widescale use.

The monetary cost of collecting and labeling data is not the only issue. An even bigger obstacle to reliable self-driving cars has to do with ensuring that the data used to train the neural networks will get the vehicle to do the right things. Not only is it impossible for AV manufacturers to collect enough data to cover all conceivable situations, but the data they do collect is likely to include biases that, if left uncorrected, can produce undesired behavior.

COMPENSATING FOR UNDERREPRESENTED SCENARIOS

Many of the most dangerous driving situations involve circumstances so rare that they are unlikely to be fully represented in real-world training data. No AV developer can expect to record enough real-world images and videos to cover all the different ways a pedestrian might run out in front of a vehicle, for example.

“One strategy for dealing with underrepresented data is to augment the recorded data with slight changes, such as geometric or chromatic variations, [made] directly in the training data,” Felsberg said. “But I would go so far as to say that it’s better to adjust the bias of the classifiers that result from the training than to apply fake data to the training process.”

Some people have floated the idea of using generative AI to produce supplemental training data for scenarios that are not close enough to the real data to be represented by slight manipulations. Felsberg thinks this approach would be catastrophic, however, given the propensity of generative AI to create absurd representations of the real world. When unrealistic data is used to train an autonomous system, the resulting network becomes unpredictable.

A better method, according to Felsberg, would be to move toward explainable AI: internal representations that reflect real scenarios by combining expert models (constructed from what human experts think are real-world scenarios) with machine-learned models (constructed from data collected in the field). Instead of altering the data used to train the model, a highly skilled technician could analyze and possibly alter the internal representation directly as part of a new step in the training process.

“If you understand the scenarios and have the right tools to manipulate the model’s internal representations of external objects, you have a more powerful way of influencing the outcome,” Felsberg said. “You could place cars in slightly different positions [than in] the real data or make them drive in a different direction than they did in the real data, to model situations that never occurred in the dataset. Prototypes of this kind of system already exist, and we are starting to participate in that kind of research with our industrial partner Zenseact.”

Felsberg continued, “The combination of model-based knowledge and data-driven knowledge is a hybrid learning approach that is very popular in many domains where predictions are required, for example, making climate predictions. But we don’t yet know how to apply this concept to autonomous vehicles. We have made some first attempts, where we demonstrated the feasibility, but there’s still a lot of research to be done. Some people argue that large language models and large multimodal models will be a big help in manipulating internal representations, and we are certainly considering that option.”

CORRECTING UNBALANCED DATASETS

Aside from filling in the gaps in training data, a remaining challenge lies in minimizing biases or, more accurately, adjusting the biases in ways that produce desired outcomes. “Some cases are much more common than others, and you would like to have control over the bias induced by this effect,” Felsberg said. “That requires methods and tools for adjusting biases in a trained model.”

[Photo: Wallenberg AI’s Michael Felsberg]

When training data is collected for pedestrians, for example, children or wheelchair users might be insufficiently accounted for because they are encountered less frequently than pedestrians in other categories. To ensure that an AV responds appropriately to all pedestrians, as any reasonable person would expect it to do, the system has to overcome skews in the data that might otherwise lead to skews in object recognition. “It’s impossible to have fully balanced datasets for whatever you want to do,” Felsberg said. “But you can measure the unbalance and have your system adjust accordingly.”

Felsberg has been asserting that self-driving cars are at least 10 years out ever since he began working on the technology in 2007, and it remains his prediction today. But when the day does come for AVs to be sold or rented to the general public, manufacturers should be required to demonstrate that their self-driving cars have overcome potential biases that result from unbalanced data, he said.

Felsberg proposes amending Euro NCAP, a safety rating system to help consumers select cars based on their reactions to a set of real-life accident scenarios, to include tests designed to do just that. “Those changes should be made now,” he said. ■

Pat Brans is a contributing writer for EE Times Europe.
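
The idea of measuring the imbalance in a dataset and adjusting a trained classifier's bias accordingly can be illustrated with a minimal Python sketch. It assumes a simple prior-correction technique (re-weighting predicted class probabilities by the ratio of desired to training class frequencies), which is a common textbook approach and not necessarily the method Felsberg's group uses. The class names, label counts, and classifier output below are hypothetical.

```python
from collections import Counter


def class_priors(labels):
    """Return the empirical frequency of each class in a list of labels."""
    counts = Counter(labels)
    total = sum(counts.values())
    return {cls: n / total for cls, n in counts.items()}


def imbalance_ratio(priors):
    """Ratio of the most common class frequency to the least common one."""
    return max(priors.values()) / min(priors.values())


def adjust_for_priors(probs, train_priors, target_priors):
    """Re-weight predicted probabilities by target/train prior, then renormalize."""
    weighted = {
        cls: p * target_priors[cls] / train_priors[cls]
        for cls, p in probs.items()
    }
    norm = sum(weighted.values())
    return {cls: w / norm for cls, w in weighted.items()}


# Hypothetical pedestrian labels: rare categories are underrepresented.
train_labels = ["adult"] * 90 + ["child"] * 8 + ["wheelchair_user"] * 2
train_priors = class_priors(train_labels)

# Assumption: we want the classifier to behave as if all classes were equally likely.
target_priors = {cls: 1 / len(train_priors) for cls in train_priors}

# Hypothetical output of a trained classifier for one detection.
raw_probs = {"adult": 0.70, "child": 0.25, "wheelchair_user": 0.05}
adjusted_probs = adjust_for_priors(raw_probs, train_priors, target_priors)

print("imbalance ratio:", imbalance_ratio(train_priors))
print("raw prediction:", max(raw_probs, key=raw_probs.get))
print("adjusted prediction:", max(adjusted_probs, key=adjusted_probs.get))
```

In this toy case the correction shifts probability mass toward the rare classes without adding any synthetic training data. Whether a uniform target prior is appropriate is a design choice that depends on the cost of each kind of error.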

        JUNE 2024 | www.eetimes.eu