Natural Language Understanding
Natural Language Understanding (NLU) is a subset of natural language processing (NLP) that focuses on the interpretation and inference of human language. While NLP deals with the broader task of processing and analyzing natural language data, NLU homes in on understanding the meaning and context behind spoken or written language. This understanding is crucial in deciphering the intent, sentiment, and nuanced meanings embedded in complex language structures.
NLU achieves this through sophisticated algorithms and machine learning models that analyze various aspects of language, such as syntax (sentence structure), semantics (meaning), and pragmatics (context). It goes beyond the mere recognition of words to grasp the deeper layers of communication, such as irony, sarcasm, and implied meanings, which are often challenging even for humans.
One key application of NLU is in conversational AI, where it enables chatbots and virtual assistants to interpret and respond to user queries accurately. But in the context of cybersecurity, NLU plays a pivotal role in detecting and mitigating sophisticated threats like social engineering attacks. By understanding the subtleties of language, NLU systems can identify potentially malicious intents in communications that might otherwise bypass traditional security measures.
With more than 347.3 billion emails sent and received every day, email has become highly susceptible to business email compromise (BEC). Defending against BEC is critical for the SOC or SIOC professional. Today, solutions for email security are available from many vendors. Reports further reveal that the global BEC market grew to $1.1 billion in 2022 and is expected to reach $2.8 billion by 2027.
But this approach to securing traditional email is becoming a smaller part of today’s threat environment. CSOs and CISOs must now defend against social engineering attacks, insider threats, ransomware, policy violations, and more – delivered not only via email, but through popular cloud workplace channels such as M365, Slack, Salesforce, LinkedIn, Zoom, WhatsApp, Telegram, and more. At least 45% of business communications now take place over cloud channels, and attackers are rapidly deploying social engineering and language-based attacks. These types of attacks can’t be discovered by existing email security solutions.
Integrated cloud communications security (ICCS) offers CSOs and CISOs visibility across the growing number of cloud workplace channels. But understanding the context and intent of messages is key to responding quickly and stopping risks earlier in the kill chain.
SafeGuard Cyber uses patented Natural Language Understanding (NLU) technology to quickly and accurately identify and defend against social engineering attacks, insider threats, ransomware, and policy violations. NLU can understand the context and intent of communications in cloud communication channels – both inbound and outbound. By evaluating attributes such as lexical features, spelling features, and topical features, NLU can determine the likelihood that the source message is a social engineering attack.
With deep visibility into 30 channels of business communications and the ability to understand more than 52 languages, SafeGuard Cyber correlates events across channels, allowing security organizations to have full visibility across all of their cloud workspace channels.
- Cybersecurity attacks using digital communications are rampant. No matter how many layers of security solutions are implemented, organizations and their networks remain vulnerable.
- With the increasing use of cloud apps to do business, these popular channels are increasing the overall attack surface, and attacks are becoming more focused on social engineering.
- Social engineering attacks in these channels can evade traditional cybersecurity detection methods. Attacks come in typical forms of human communications, so there are no links or files to investigate.
- Many organizations don’t have the visibility to prevent attacks taking place over cloud communications channels. They may be using solutions to protect email communications, but these solutions can’t extend to the burgeoning cloud workspace.
To help CSOs and CISOs bring expert security to the cloud workspace, SafeGuard Cyber deploys patented Natural Language Understanding technology to analyze, identify, and prevent social engineering attacks across more than 30 cloud communications channels, protecting organizations from business communication compromise (BCC).
The progress of machine learning-powered solutions has been nothing short of astronomical. From information extractors to text translators, natural language processing (NLP) and understanding have evolved dramatically. Now, with tools like ChatGPT, people can generate content that looks, feels, and reads “human”, expediting processes and providing advantages to their users.
Even before the advent of ChatGPT, SafeGuard Cyber was using AI and NLU capabilities to fight digital threats online, from phishing attacks to ransomware to insider threats.
SafeGuard Cyber security combines Natural Language Understanding and AI machine learning to understand the human elements of context and intent in cloud communications. Language-based elements in a conversation, including lexical features, spelling, and topical elements are automatically analyzed and evaluated against models to identify social engineering attacks. Now, with visibility across channels and NLU analysis of context and intent, organizations can respond faster and stop attacks earlier in the kill chain.
Natural Language Understanding processes communications in three phases:
- The first phase is pre-processing and text preparation. In this phase, the input message goes through initial attributes extraction and token extraction. The text preparation portion of the first phase of processing begins by detecting what language the source message uses. This is important to ensure that text tokenization is done in the appropriate manner. For example, languages like Japanese do not use spacing the way western languages do and must be processed appropriately. This requirement applies to all of the NLP processing steps.
- In the second phase, features are extracted from the original message or the message tokens and passed to a trained machine model. Once the language has been detected, a text preparer begins text tokenization. The text preparer receives the input text and breaks it up into logical tokens that are to be evaluated in the feature extraction phase. The text preparer can include programming for updating the text message to include part of speech (POS) tags for each word in the text. This includes labels like noun, proper noun, verb, adverb, etc. In addition, the text preparer removes unnecessary words from the text, such as “is”, “the”, “a” and others that are not needed for future action.
- In the third phase, all features are extracted from the message and scored using the model. The output of the model is a final risk assessment indicating if the analyzed message is potentially a social engineering attack, and such risk assessment can be followed by appropriate remedial action.
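As an illustration of the text preparation step described above, the following minimal sketch tokenizes a message and drops stop words. It is not SafeGuard Cyber's implementation: the function name prepare_text and the tiny STOP_WORDS set are assumptions made for this example, language detection and part-of-speech tagging are left out, and the regular expression only suits space-delimited languages such as English.

```python
import re

# Hypothetical, deliberately tiny stop-word list. A real system would pick a
# per-language list after detecting the language of the source message.
STOP_WORDS = {"is", "the", "a", "an", "of", "to", "and"}


def prepare_text(message: str) -> list[str]:
    """Lowercase the message, split it into word tokens, drop stop words."""
    # This pattern assumes words are separated by spaces or punctuation,
    # which does not hold for languages such as Japanese.
    tokens = re.findall(r"[a-z0-9']+", message.lower())
    return [token for token in tokens if token not in STOP_WORDS]


print(prepare_text("The invoice is overdue, pay a fee now"))
# ['invoice', 'overdue', 'pay', 'fee', 'now']
```

The cleaned token list is what later feature extractors, such as the spelling and topic extractors, would consume.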
After evaluation is completed, a lexical feature extractor assesses the attributes list generated in the pre-processing phase to determine whether any elements within these attributes may be a sign of attack. For example, the inclusion of a URL that has an IP address instead of a domain name is not normal in business communications.
The use of an unusually long URL in the message is often seen as a strategy to mask a suspicious domain or subdomain of the URL. A lexical feature extractor determines if possible fake domains are in use. For instance, “facebok.com” or “amazn.com” may be employed in an attack.
In these cases, the domain in use is extremely close to facebook.com or amazon.com. Most users will not notice the difference and will consider the domains safe. The results of these components can be combined into a vector that is then merged with other text features and passed to the model evaluator for analysis.
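The three lexical checks described above (an IP address in place of a domain name, an unusually long URL, and a lookalike domain) can be sketched as follows. This is an illustrative approximation, not the patented method: the KNOWN_DOMAINS list, the 75-character length cut-off, and the edit-distance limit of 2 are all assumptions chosen for the example.

```python
import ipaddress
from urllib.parse import urlparse

KNOWN_DOMAINS = ("facebook.com", "amazon.com")  # hypothetical reference list
LONG_URL_CHARS = 75   # hypothetical cut-off for "unusually long"
MAX_TYPO_DISTANCE = 2  # hypothetical limit for "extremely close"


def edit_distance(a: str, b: str) -> int:
    """Levenshtein distance, computed row by row."""
    prev = list(range(len(b) + 1))
    for i, ca in enumerate(a, start=1):
        curr = [i]
        for j, cb in enumerate(b, start=1):
            cost = 0 if ca == cb else 1
            curr.append(min(prev[j] + 1, curr[j - 1] + 1, prev[j - 1] + cost))
        prev = curr
    return prev[-1]


def host_is_ip(host: str) -> bool:
    """True when the host is a literal IPv4 or IPv6 address."""
    try:
        ipaddress.ip_address(host)
        return True
    except ValueError:
        return False


def lexical_features(url: str) -> list[float]:
    """Return [ip_host, long_url, lookalike_domain] as 0.0 / 1.0 flags.

    The URL must include a scheme (http:// or https://) so that urlparse
    can locate the host.
    """
    host = urlparse(url).hostname or ""
    if host.startswith("www."):
        host = host[4:]
    # Distance 0 is the genuine domain, so only near misses count.
    lookalike = any(
        0 < edit_distance(host, known) <= MAX_TYPO_DISTANCE
        for known in KNOWN_DOMAINS
    )
    return [float(host_is_ip(host)), float(len(url) > LONG_URL_CHARS), float(lookalike)]


print(lexical_features("http://192.168.10.5/login"))   # [1.0, 0.0, 0.0]
print(lexical_features("https://facebok.com/reset"))   # [0.0, 0.0, 1.0]
```

Returning plain numeric flags keeps the result easy to concatenate with other features before scoring.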
A spelling feature extractor receives the list of cleaned tokens and analyzes them for spelling errors. For example, the spelling feature extractor can count the number of misspelled words in a message or document and then generate a normalized metric for this count based on the length of the message. This normalized misspelled word count can be used in conjunction with other extracted features by the model. Also, the spelling feature extractor outputs a ratio of spelling mistakes for further processing.
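A minimal sketch of the normalized misspelling metric might look like the following. The DICTIONARY set is a hypothetical stand-in for a real word list or spell checker, and the function name spelling_features is an assumption made for this example.

```python
# Hypothetical miniature dictionary. A real extractor would rely on a full
# word list or a spell-checking library for the detected language.
DICTIONARY = {"please", "verify", "your", "account", "password", "now"}


def spelling_features(tokens: list[str]) -> tuple[int, float]:
    """Return (misspelled word count, misspelled words / total tokens)."""
    # Only alphabetic tokens are checked, so numbers are never "misspelled".
    misspelled = sum(
        1 for token in tokens if token.isalpha() and token not in DICTIONARY
    )
    ratio = misspelled / len(tokens) if tokens else 0.0
    return misspelled, ratio


print(spelling_features(["please", "verfy", "your", "acount", "now"]))
# (2, 0.4)
```

Dividing by the token count is what makes a two-typo short message look riskier than a two-typo long report.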
Then, the topics feature extractor evaluates the list for common topics included in social engineering attack messages, detecting questionable attributes such as:
- where the message recipient is being pressured to do something
- where there is some sort of gift or unexpected award offered in the message
- where there is pressure to keep the communication private
- where the message recipient is being asked to verify or change their password or credentials
- where an attacker is trying to get the recipient to believe that there is something wrong with their account or that they have been attacked
- where the intent is to get the message recipient to make a payment on a fake invoice
- where the sender attempts to impersonate an individual with authority
- where the message asks the recipient to do something questionable
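To make the topic checks above concrete, here is a simplified sketch that flags topics by keyword overlap. Keyword matching is a stand-in chosen for brevity, since the text says topics are evaluated against models. The topic names and keyword sets are hypothetical, and impersonation and "questionable request" topics are omitted because keywords alone cannot capture them.

```python
# Hypothetical keyword sets, one per topic from the list above.
TOPIC_KEYWORDS = {
    "urgency": {"urgent", "immediately", "asap"},
    "reward": {"gift", "prize", "winner"},
    "secrecy": {"confidential", "secret", "private"},
    "credentials": {"password", "credentials", "login"},
    "account_problem": {"suspended", "locked", "compromised"},
    "payment": {"invoice", "payment", "wire"},
}


def topic_features(tokens: list[str]) -> dict[str, bool]:
    """Flag each topic whose keyword set overlaps the message tokens."""
    present = set(tokens)
    return {topic: bool(present & words) for topic, words in TOPIC_KEYWORDS.items()}


flags = topic_features(["urgent", "wire", "payment", "keep", "confidential"])
print([topic for topic, hit in flags.items() if hit])
# ['urgency', 'secrecy', 'payment']
```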
The final phase of the process is model analysis. Here, the combined feature vector is passed to a pre-trained model evaluator that predicts the overall risk score of the received message. The risk score is a value between zero and one, and any score above 0.5 is considered a possible social engineering attack.
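The scoring step can be illustrated with a small logistic model that maps a feature vector to a value between zero and one. The source does not say what kind of model is used, so logistic regression is an assumption here, and the weights, bias, and feature order are invented for the example. Only the 0.5 threshold comes from the text.

```python
import math

# Hypothetical feature order:
# [ip_url, long_url, lookalike_domain, misspelling_ratio, topic_hit]
WEIGHTS = [1.5, 0.5, 2.0, 3.0, 1.0]  # invented weights, not trained values
BIAS = -2.0                          # invented bias
RISK_THRESHOLD = 0.5                 # threshold stated in the text


def risk_score(features: list[float]) -> float:
    """Map a feature vector to a risk score strictly between 0 and 1."""
    if len(features) != len(WEIGHTS):
        raise ValueError("feature vector has the wrong length")
    z = BIAS + sum(w * x for w, x in zip(WEIGHTS, features))
    return 1.0 / (1.0 + math.exp(-z))  # logistic (sigmoid) function


def is_possible_attack(features: list[float]) -> bool:
    """Apply the 'above 0.5' rule to the score."""
    return risk_score(features) > RISK_THRESHOLD


benign = [0.0, 0.0, 0.0, 0.0, 0.0]
suspicious = [0.0, 0.0, 1.0, 0.4, 1.0]
print(is_possible_attack(benign), is_possible_attack(suspicious))
# False True
```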
SafeGuard Cyber’s own Natural Language Understanding and AI-powered platform performs these tasks efficiently. As a result, our platform can stop social engineering and phishing attacks in their tracks.
Here are some real-world examples, lifted from the SGC platform:
Figure 1. An Indian phishing scam flagged by the SafeGuard Cyber platform
Aside from using controversial stories to pique interest and attract attention, social engineers and phishers often use scams and false promises to convince people to either give out money or information, as evidenced by the example below:
Figure 2. A UK Visa scam email flagged by the SafeGuard Cyber platform.
These tie into the various use cases of Natural Language Understanding and AI to protect industries from:
Moreover, NLU benefits companies by addressing compliance requirements for mobile messaging, CRM free text, and social selling, especially for highly-regulated industries like life science and financial services companies.
SafeGuard Cyber’s NLU benefits enterprises by operating in a variety of environments. One example environment is an email system that receives hundreds, thousands, or even millions of messages every day. An email may be sent to one or more employees of the organization and contains a sender email address, a subject line, and body copy. Together, the address, subject line, and body copy may be treated as an electronic message, or source message, which includes message metadata.
SafeGuard Cyber NLU can process hundreds, thousands or even millions of messages in near real-time. Messages that are flagged as meeting a risk threshold are kept from immediate transmission or processed in an effort to thwart or minimize the perceived risk. In this step, only messages that are not perceived to be risky are passed to the intended recipients. Source messages that are considered a risk can be evaluated further and deleted, transmitted to proper authorities or otherwise acted upon as deemed appropriate.
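The hold-or-deliver decision described above could be sketched as follows. The ScoredMessage type, the route function, and the choice to treat a score exactly at the threshold as deliverable are assumptions for illustration. What happens to quarantined messages afterwards (review, deletion, escalation) is left out.

```python
from dataclasses import dataclass


@dataclass
class ScoredMessage:
    """Hypothetical container pairing a message ID with its risk score."""
    message_id: str
    risk: float


def route(messages: list[ScoredMessage], threshold: float = 0.5):
    """Split message IDs into (delivered, quarantined) by risk score."""
    delivered, quarantined = [], []
    for message in messages:
        # Assumption: only scores strictly above the threshold are held back.
        if message.risk > threshold:
            quarantined.append(message.message_id)
        else:
            delivered.append(message.message_id)
    return delivered, quarantined


batch = [ScoredMessage("m1", 0.12), ScoredMessage("m2", 0.91), ScoredMessage("m3", 0.5)]
print(route(batch))
# (['m1', 'm3'], ['m2'])
```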
How NLU determines context and intent to detect attacks across channels and language
There are significant disruptions taking place in the overall messaging environment. First, organizations are moving away from on-premises Microsoft Exchange email servers to applications in the cloud, such as Microsoft 365.
Second, attacks are changing; no one sends just malicious links anymore. Attacks have become more targeted and more focused on social engineering, and they have broadened their scope to target everyone in an organization. This means that file analysis and link analysis are only marginally useful.
The email market is so massive that everyone who needs email has some level of email security. But many of these legacy solutions are based on an older generation of technology and cannot detect attacks that take advantage of multiple communications channels.
By deploying NLU, SafeGuard Cyber can prevent attacks that reach beyond email. Using APIs instead of agents, we are able to evaluate incoming and outgoing communications in cloud workspaces such as Slack, Teams and chat. This is a high-level differentiator. But the differentiation in this space is beyond just detections.
Many solutions perform metadata analysis, but all they are doing is checking the domain to see if it has a strong reputation; how old it is; or when it was created. A lot of companies’ solutions connect to Microsoft 365, but they do not deploy machine learning and are not complete solutions.
To identify risk in communication channels, SafeGuard Cyber builds a profile of activities – a digital profile – that includes many factors that determine behavior. For example, you don’t send emails at this time of day; you don’t use email signatures; you don’t misspell; you don’t send emails from India. The database that SafeGuard Cyber draws on contains thousands of behavioral features, including how people communicate with each other.
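A behavioral profile check of this kind can be pictured with the toy sketch below, which compares one message against a sender's usual habits. The SenderProfile fields and the three checks are hypothetical and cover only a few of the examples mentioned above. A real profile would draw on far more behavioral features.

```python
from dataclasses import dataclass


@dataclass
class SenderProfile:
    """Hypothetical per-sender baseline learned from past messages."""
    usual_hours: range           # hours of the day (0-23) the sender is active
    usual_countries: frozenset   # country codes the sender normally sends from
    uses_signature: bool         # whether the sender normally signs messages


def deviations(profile: SenderProfile, hour: int, country: str,
               has_signature: bool) -> list[str]:
    """List the ways a message departs from the sender's usual behavior."""
    found = []
    if hour not in profile.usual_hours:
        found.append("unusual_hour")
    if country not in profile.usual_countries:
        found.append("unusual_country")
    if has_signature != profile.uses_signature:
        found.append("signature_mismatch")
    return found


profile = SenderProfile(range(8, 19), frozenset({"US"}), uses_signature=False)
print(deviations(profile, hour=3, country="IN", has_signature=True))
# ['unusual_hour', 'unusual_country', 'signature_mismatch']
```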
Through analysis of behavioral features, NLP can determine when normal activity is happening or, for example, when someone is spoofing you. Using core ML models, threat analysis capabilities, vision analysis, OCR, and content evaluation, SafeGuard Cyber helps you avoid cybersecurity attacks such as CEO impersonation, email compromise, payroll fraud, account takeover, and more.
The SafeGuard Cyber detection engine is flexible because attacks are always changing. Today, you may send customers a thousand emails, but only one is opened. On the other hand, WhatsApp has more than a 90% open rate in most countries – nine times out of ten, if you send a message through WhatsApp, it will be opened. This is an example of where cloud communications is headed. With the Natural Language Understanding capabilities of SafeGuard Cyber, organizations can embrace new communication channels and digital transformation, and do it responsibly and securely.
When it comes to ransomware, prevention is better than cure. Reducing the risk of ransomware incidents should be a priority for businesses. However, should an organization be unfortunate enough to fall prey to ransomware, the following steps should be followed:
- Remove the Device from the Network.
Ransomware on one device is bad, but ransomware proliferating through a network of devices is catastrophic. Employees should be trained to immediately disconnect their device from the network if they see a ransomware demand displayed on their screen. They should also do the same if they observe anything peculiar, such as an inability to access their own files. Employees must not attempt to restart the device; it should be sent immediately to the IT department.
- Notify Law Enforcement.
Ransomware is a crime. Theft and extortion rolled into one make it a law enforcement concern. Organizations should all default to immediately contacting the police cybercrime department, should they fall victim to a ransomware attack.
- Use Digital Risk Protection to Establish the Scope of Attack.
In the wake of a ransomware attack, security teams need to gather as much intelligence as they can, as fast as they can. This will help both internal IT teams and law enforcement agencies formulate a response. Enterprises should strive to figure out the nature of the attack: who is behind it, what tools they used, who they targeted and why. Answering such questions can help your IT managers and network administrators figure out the extent of the attack and protect networks from future attacks.
- Consult with Stakeholders to Develop the Proper Response.
Enterprises suffering a bad ransomware attack need to answer a host of questions: Can they afford to lose access to the targeted files, either because they have been backed up, or because they are not of the highest priority? Can the organization afford the ransom? Is there any room for negotiation? All stakeholders, from shareholders to legal counsel, should be consulted.
- Get the Post-Mortem Right.
The best way to resist a ransomware threat is to have learnt from the last one. After an attack, enterprises should task their IT technicians, network administrators, and cybersecurity teams with a thorough review of the breach. A meticulous assessment of an organization's infrastructure, practices, and processes is required to discover flaws in security, and reinforce an enterprise against existing and future threats.
Fortunately, more companies are becoming smart enough to not give in to the threat of ransomware. In Q4 2020, the average ransom payment fell 34%, to $154,108 from $233,817 in Q3 2020.
The dramatic decline can be attributed to the recent instances of malware attacks where, instead of being deleted, the stolen data is released publicly, even when the affected organization or individual pays. Now, more victims of cyber extortion are saying “no” to ransom payments, and are becoming smarter in their cybersecurity efforts by creating backups of their data and following best practices.
Hopefully, moving forward, more companies will proactively secure their data by following the best practices stated above and continue to resist being strong-armed by ransomware attackers. When cyber extortion loses its profitability, organizations win.
With proper communication risk protection, organizations can detect and nullify ransomware threats before they become an issue. The SafeGuard Cyber platform can keep pace with the scale and velocity of modern digital communications, and detect phishing links and other indicators of ransomware attacks across the full suite of cloud applications. Threats are instantly flagged and quarantined before an unsuspecting human target clicks on anything dangerous.