ATD 360

A Cyber Security Web Application

Context

ATD is a cybersecurity application that, at its core, works like a log system but offers much more. It is installed on every endpoint in an organization and tracks events. Its primary function is to search all events for known threat patterns and raise an alarm with a specific severity level whenever events match the predefined security rules. As one of the product designers on the team, I worked on improving the experience of security analysts (our end users) in their primary task: threat hunting.
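
As a rough illustration of the rule-matching idea described above, the sketch below scans an ordered list of events for a known sequence of event types and raises an alarm with a severity level. The `Rule` and `Alarm` classes, the event-type names, and the in-order matching logic are all assumptions made for illustration; they are not ATD's actual data model or detection engine.

```python
from __future__ import annotations

from dataclasses import dataclass


@dataclass(frozen=True)
class Rule:
    """Hypothetical security rule: an ordered pattern of event types."""
    name: str
    pattern: tuple[str, ...]
    severity: str


@dataclass(frozen=True)
class Alarm:
    """Hypothetical alarm raised when a rule's pattern is found."""
    rule_name: str
    severity: str
    matched_indexes: tuple[int, ...]


def match_rule(events: list[str], rule: Rule) -> Alarm | None:
    """Return an Alarm if the rule's pattern occurs in order (not
    necessarily contiguously) within the events, else None."""
    matched = []
    position = 0
    for index, event_type in enumerate(events):
        if position < len(rule.pattern) and event_type == rule.pattern[position]:
            matched.append(index)
            position += 1
    if position == len(rule.pattern):
        return Alarm(rule.name, rule.severity, tuple(matched))
    return None


# Event types loosely modeled on the events list in this case study.
events = [
    "connection_established",
    "user_login",
    "file_upload",
    "user_created",
    "file_deleted",
]
rule = Rule(
    name="suspicious_account_activity",
    pattern=("user_login", "user_created", "file_deleted"),
    severity="high",
)
alarm = match_rule(events, rule)
```

Here the alarm records which events matched, which is the kind of detail an analyst would later need when validating the scenario.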

Organization’s Events List

New connection established

User ‘A’ has logged in

‘A’ has uploaded a file

‘A’ has created a new user

The new user is deleting files

Step 1

Pattern Detected

Step 2

An Alarm is Generated

Security Analyst manually checks alarms on a daily basis

Step 3

The Appropriate Reaction is taken

About Graph (Optional)

Roles & Responsibilities

Starting My Journey in Graph

Interaction Designer

User Researcher

During First Phases

Product Designer

User Researcher

Design System Designer

ATD360 Platform

Product Design Lead

User Researcher

Business Problem

"Our customers are leaving because they find the product difficult to use. Also, when our analysts leave their company, others struggle to use the product without their help."

Data Gathering

I had several questions at the start of this project.

I was thinking...

How are the security analysts working?

What do they mean by "difficult"? How can I measure it?

Who are our users?

On what scale are we losing customers?

What is making the product difficult to use?

I created a research plan and started gathering data.

Step 0: Stakeholder Interview

I wanted to know on what scale we were losing customers and how it was happening. Were they just removing the software? Leaving the contract? Calling us to say they would not extend their license?

25% of customers didn't renew their license after 3 months

Step 1: Contextual Inquiry

I was totally new to the domain of cybersecurity and had no idea why our users were experiencing difficulties with the software. So, the first step I took was simply observing them while they worked.

6 Participants, 12 Attack Scenarios

Alarms List

Before ATD360

A Known Workflow

Before ATD360

1

Alarms List Page

Checking the list of alarms to find a pattern

2

Alarms List Page

Opening many alarms in different tabs to explore the details

3

Alarm Detail Page

Switching between them and memorizing (or taking notes on) the details

4

Alarm Detail Page

Potentially finding a pattern

5

Linux

Manually connecting to identified systems via terminal and executing scripts to respond to threats

Step 2: Deep Interview

After reviewing their daily workflow and learning about every step of their tasks, I began interviewing participants to understand their pain points.

8 Participants

It's so hard to memorize and keep the data in your mind while switching between different tabs.

I can't react fast even when I find the threat. There was a time I found a hacking connection and it took 2 mins to close it.

I'm finding patterns between hundreds of alarms every day, and I can't hit my daily goal with all of these alarms.

It happens a lot during the day that I miss a specific alarm and I get the whole pattern wrong.

I can't detect a long-planned attack. I have to check even previous alarms in the month to find if this new alarm is related to that one.

I get confused easily with all these tabs open. I just separate them in different windows.

It's so hard to find the pattern in one look. Even I, with several years of experience, get confused with so many alarms.

Bad days occur when a real attack happens. We are buried under thousands of alarms and can’t react quickly.

I know what I need to accomplish, but I always have to complete numerous small tasks to get my job done.

Try checking a malicious file yourself using the software, and you’ll understand what I mean by ‘pain’.

I’ve attempted to use numerous scripts, which I wrote myself, to automatically identify patterns that are familiar to me.

Pain Points (I'll focus on 2 of them in this case study)

I can't hit my daily target KPI (Detected Attack Scenarios/Day)

Finding a pattern among lots of alarms is difficult.

After detection, we can't respond quickly.

We can't divide tasks between team members.

There are too many alarms.

Switching between products is difficult.

Final Problem Statement

We are losing customers because security analysts can't hit their organization's preferred daily KPI. They can't hit it because:

1

In-App Data Flow isn’t working correctly. Users have to switch between multiple tabs to validate just one threat scenario.

2

Responding to a recent attack requires manual scripting and coding across various applications to connect to infected systems and perform the appropriate tasks, such as isolation and file removal.

Metrics

We've lost 25% of our customers in 3 months.

Success Metric

Ads Active Vendors

(Assuming this new feature can bring new vendors to the ads system)

Success Metric

Banner Sales Average Handling Time

Success Metric

AVG Count of Calls for Converting a Vendor

Adoption Metric

Count of Converted Vendors / Count of Submitted Requests

Adoption Metric

Count of Purchased Banners

Ideation

1

In-App Data Flow isn’t working correctly. Users have to switch between multiple tabs to validate just one threat scenario.

The issue originated from the way we presented data. During interviews, I discovered that we were functioning more like a log system than a system that accurately represents network activity. Benchmarks pointed the same way: presenting event data in a ‘List’ format is not an efficient method.

Data Presentation Ideas (I had many ideas, but these three were the most feasible)

I aimed to automatically identify the connections between alarms using an algorithm, and display the related alarms and their relationships in a single view.

I selected this one because:

  • The other methods displayed an excessive amount of data and took up a significant amount of space.

  • This one is more visual.

  • Security analysts are already familiar with graph visualizations.

  • It required less technical effort.

  • It organizes alarms based on "Hosts" (where the analysts' reactions are primarily focused).

In this manner, a page with hundreds of alarms is transformed into incident pages that present the relationships between alarms using a graph.
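
The case study does not describe the correlation algorithm itself, so the following is only a minimal sketch of one way alarms could be grouped into incidents. It assumes that alarms touching the same host, directly or through a chain of other alarms, belong to the same incident, and it uses a simple union-find structure. The function name `group_into_incidents`, the alarm ids, and the host names are all hypothetical.

```python
from __future__ import annotations

from collections import defaultdict


def group_into_incidents(alarms: dict[str, set[str]]) -> list[set[str]]:
    """Group alarm ids into incidents: alarms that share a host, directly
    or through a chain of other alarms, end up in the same incident."""
    parent = {alarm_id: alarm_id for alarm_id in alarms}

    def find(alarm_id: str) -> str:
        # Follow parent links to the group's representative alarm.
        while parent[alarm_id] != alarm_id:
            parent[alarm_id] = parent[parent[alarm_id]]  # path halving
            alarm_id = parent[alarm_id]
        return alarm_id

    first_alarm_on_host: dict[str, str] = {}
    for alarm_id, hosts in alarms.items():
        for host in hosts:
            if host in first_alarm_on_host:
                # Same host seen before: merge the two alarm groups.
                parent[find(alarm_id)] = find(first_alarm_on_host[host])
            else:
                first_alarm_on_host[host] = alarm_id

    incidents = defaultdict(set)
    for alarm_id in alarms:
        incidents[find(alarm_id)].add(alarm_id)
    return list(incidents.values())


# Hypothetical alarms mapped to the hosts they involve.
alarms = {
    "malicious_file_downloaded": {"host-1"},
    "connection_to_banned_ip": {"host-1", "host-2"},
    "registry_file_edited": {"host-2"},
    "new_user_created": {"host-9"},
}
incidents = group_into_incidents(alarms)
```

With this input, the three alarms linked through `host-1` and `host-2` form one incident, and the unrelated alarm on `host-9` forms its own. A real system would likely also consider time windows, users, and files when linking alarms.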

Alarms List

Before

Alarms List

After

An incident

A file has been downloaded and it created a connection to a banned IP, then a registry file was changed by that file.

A malicious file has been downloaded

Risk: High

Time: 23 Hours ago

Desc: …

Admin user created a new user in of...

Risk: Medium

Time: 2 Hours ago

Desc: …

A connection was created to a ban...

Risk: High

Time: 2 Mins ago

Desc: …

Registry file has been edited

Risk: High

Time: 23 Hours ago

Desc: …

2

Responding to a recent attack requires manual scripting and coding across various applications to connect to infected systems and perform the appropriate tasks, such as isolation and file removal.

After defining an attack scenario, security analysts needed to respond. With the new form of data visualization, I had numerous opportunities to provide quick access for taking actions directly on a host (since graph nodes represent hosts). This was the point at which I began to examine all the necessary actions that each alarm element would require.

Taking action required background information to assist the security analyst in making better decisions.

Security analysts had to answer many questions before deciding what actions to take.

Who downloaded it?

Where was the first place we found this file?

Which user created this connection?

Which processes were run by this file?

Which hosts were infected by this file?

When was this file seen?

What's the size of the file?

Solution

Profiles: For every action displayed in the graph, a profile will be provided. Each profile consists of two main parts: ‘Information’ and ‘Actions’.
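
To make the two-part structure of a profile more concrete, here is a small sketch of how a profile could be modeled. The `Profile` class, its field names, and the `isolate_host` placeholder are assumptions for illustration only; the real product's profile model and its actions are not documented in this case study.

```python
from __future__ import annotations

from dataclasses import dataclass, field
from typing import Callable


@dataclass
class Profile:
    """Hypothetical profile for one graph element (e.g. a file or host)."""
    element_id: str
    # 'Information': background facts that help the analyst decide.
    information: dict[str, str] = field(default_factory=dict)
    # 'Actions': named operations the analyst can trigger on the element.
    actions: dict[str, Callable[[str], str]] = field(default_factory=dict)

    def run(self, action_name: str) -> str:
        """Run a named action against this element and return its result."""
        if action_name not in self.actions:
            raise KeyError(f"unknown action: {action_name}")
        return self.actions[action_name](self.element_id)


def isolate_host(host_id: str) -> str:
    # Placeholder: a real action would talk to the endpoint agent.
    return f"isolation requested for {host_id}"


host_profile = Profile(
    element_id="host-1",
    information={"risk": "High", "time": "23 hours ago"},
    actions={"isolate": isolate_host},
)
result = host_profile.run("isolate")
```

Keeping information and actions side by side in one object mirrors the design goal: the analyst reads the context and responds from the same place, without opening a terminal.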

Outcome

We talked with the marketing team and ran another usability test after 3 months.

Primary Metric

Customer Churn Rate

-23%

Primary Metric

AVG Detected Attack Scenarios / Day

+37%

Secondary Metric

AVG Error Rate

-17%

Next Steps

After redesigning the platform and acquiring new customers, we developed new features to streamline and expedite the workflow of the analysts. There were additional metrics that we wanted to concentrate on, and new features were being designed to optimize them.

Playbooks

A feature to run multiple commands on multiple hosts at the same time.
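
A playbook of this kind could be sketched roughly as follows. This is only an illustration under stated assumptions: `run_command` is a hypothetical stand-in for whatever remote call the platform makes, and the thread pool from Python's standard library is one possible way to run the work concurrently, not necessarily how the feature was built.

```python
from __future__ import annotations

from concurrent.futures import ThreadPoolExecutor
from itertools import product


def run_command(host: str, command: str) -> str:
    # Placeholder for a real remote call; it only reports what it would do.
    return f"{host}: {command} done"


def run_playbook(hosts: list[str], commands: list[str]) -> dict[tuple[str, str], str]:
    """Run every command on every host concurrently and collect the results."""
    pairs = list(product(hosts, commands))
    with ThreadPoolExecutor(max_workers=8) as pool:
        outputs = pool.map(lambda pair: run_command(*pair), pairs)
        return dict(zip(pairs, outputs))


results = run_playbook(["host-1", "host-2"], ["isolate", "remove_file"])
```

Note that this sketch gives no ordering guarantee between commands on the same host; a playbook whose steps depend on each other would need to run them sequentially per host.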

Auto Response

Define actions for recognized threats, and those actions will be executed automatically.
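
In its simplest form, this idea could look like a lookup from a recognized threat to its predefined actions. The sketch below is a guess at the shape of such a feature: the `AUTO_RESPONSES` table, the threat names, and the action names are all hypothetical.

```python
from __future__ import annotations

# Hypothetical mapping from a recognized threat to its predefined actions.
AUTO_RESPONSES: dict[str, list[str]] = {
    "connection_to_banned_ip": ["close_connection", "isolate_host"],
    "malicious_file_downloaded": ["remove_file"],
}


def auto_respond(threat: str, host: str) -> list[str]:
    """Return the actions triggered for a threat on a host; unrecognized
    threats trigger nothing and are left for an analyst to review."""
    return [f"{action} on {host}" for action in AUTO_RESPONSES.get(threat, [])]


triggered = auto_respond("connection_to_banned_ip", "host-1")
ignored = auto_respond("unknown_threat", "host-1")
```

Leaving unrecognized threats untouched keeps the analyst in control of anything the rules do not explicitly cover.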

Challenges & Lessons

Challenges

1

Tech-Based Company

I had to convince the development team that an automatic version of data extraction was necessary. This was not an easy task, as it required a significant shift in their mindset and approach.

2

Small Team

There was a lot to do, and in the early phases we didn't have enough designers to handle the work.

3

Complex Domain

I had to learn a lot about the security domain. Reading and understanding security-related material was a difficult yet rewarding experience.

Lessons

1

Customize Design Process

There were numerous phases during which we had to adjust our process to meet the requirements within a tight deadline. I experimented with various customized processes to find the one that best suited our team.

2

Collaboration in the Squad

I often had to make decisions in challenging situations and choose the right solution quickly. There was a time when I didn’t have sufficient data to substantiate something, and team collaboration was crucial in finding the best answer.