The Columbia University IDS lab is conducting a unique study to help understand user behavior on host-based systems. The study is part of the RUU project. The study is entirely voluntary, and presents a unique opportunity for anyone with a host computer to help our research efforts. The IRB for this study is available here.
In this study, our objective is to evaluate recently developed modeling techniques for masquerade attack detection. A masquerade attack is one in which a user of a system illegitimately poses as, or assumes the identity of, another legitimate user. Identity theft in financial transaction systems is perhaps the best-known example. Masquerade detection remains an important research area that requires new insights in order to mitigate this threat.
Prior research focused on developing novel algorithms that can effectively identify suspicious behaviors that may lead to the identification of imposters. In this study, we do not focus on whether an access by some user is authorized or not: We assume that the masquerader does not attempt to escalate the privileges of the stolen identity. Rather, the masquerader simply accesses whatever the victim can access. However, we conjecture that the masquerader is unlikely to know how the victim behaves when using a system. We rely on this key assumption in order to detect a masquerader, and focus on monitoring a user's behavior in real time to determine whether recent commands issued by a user are consistent with the user's historical behavior.
To test the conjecture above, we need to collect data to model the baseline normal behavior of individual computer users, and we need to evaluate how well the user models detect significant changes in behavior that may indicate misuse of computer privileges.
One example is to model how users typically search their computer's file system, and to use these models to detect unusual searches that may indicate misuse by someone who is not authorized to use the machine. The collected data set will be anonymized (so no one can see who you are) and sanitized before being shared with the research community for research purposes.
In order to collect this data set, we have developed a host-based sensor for both Linux and Windows systems.
The Linux sensor gathers system commands issued by the volunteer during normal use of the computer. It is a lightweight hook into the audit daemon system library, which collects information about the running system. The information collected includes the name of the process, its path, command line parameters, the process ID, and system-level calls. The user can define which terms will be removed (through a built-in filter) and can review the collected data at any time. In addition, a copy of the data can be archived on the local system so that users can study their own long-term usage patterns. When ready, the sensor will ask permission to upload the anonymized data to one of our IDS lab servers. The data upload frequency can be customized through the sensor GUI.
The privacy filter included with the sensor is there to help end users restrict the type of information to be collected and shared. For example, by default all local user names on the machine are replaced by userX. The user is free to define their own string replacement rules to allow the system to mask private data.
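As an illustration of the kind of string replacement the privacy filter performs, here is a minimal Python sketch. It is not the sensor's actual code or rule syntax: the function names `build_rules` and `mask`, the rule format (plain text paired with a replacement), and the sample user names are all assumptions made for this example.

```python
import re

def build_rules(local_users, extra_rules=()):
    """Build (compiled pattern, replacement) pairs.

    Hypothetical rule format: every local user name maps to "userX",
    followed by user-defined (text, replacement) pairs.
    """
    rules = []
    for name in local_users:
        # Word boundaries keep "alice" from matching inside "malice".
        rules.append((re.compile(r"\b" + re.escape(name) + r"\b"), "userX"))
    for text, replacement in extra_rules:
        rules.append((re.compile(re.escape(text)), replacement))
    return rules

def mask(line, rules):
    """Apply every replacement rule, in order, to one line of audit data."""
    for pattern, replacement in rules:
        # A function replacement avoids backslash processing in the text.
        line = pattern.sub(lambda match, r=replacement: r, line)
    return line

# Made-up user names and rule, for illustration only.
rules = build_rules(["alice", "bob"], extra_rules=[("ProjectPhoenix", "projectX")])
masked = mask("/home/alice/ProjectPhoenix/notes.txt", rules)
print(masked)
```

Applying rules in a fixed order keeps the result predictable when one rule's output could match a later rule.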
If you would like to participate, please follow the instructions below:
Download and Installation Instructions
- V17 Updated: Dec 10...fixed die after 15 due to miscode logic
Click here to download the Linux sensor.
- Untar the latest sensor to any directory using "tar -xvf sensor-name.tar"
- Switch to the directory where you have untarred the downloaded file.
- Run the ./install.pl script to install the sensor.
- Note: You can uninstall/re-install the sensor using the provided scripts.
- To set the data upload frequency, start the sensor via the installed link on your desktop, and click on the "Upload Options" button.
- Select data upload frequency (hourly, daily or weekly).
- There is an option to save a copy of any uploaded data after an upload is completed. By default all uploaded data is deleted from the local system.
- Note: You also have the option of initiating a data upload by clicking on the "Force Upload" button from the sensor's GUI at any time.
Windows Sensor
The Windows sensor allows the user to gather process and user behavior data by monitoring process registry behavior and user window touches.
The latest version will be uploaded here. Previous versions should not be used and are no longer supported.
The MSI install script makes the sensor easy to install. By default the program is installed to c:\program files\Columbia University\RUU-sensor\. After running the install script, some shortcuts will be installed in the program groups.
Note: There is a shortcut to uninstall the sensor; you can also use the Control Panel uninstall feature.
If you have any further questions about the study, please contact us.
If you have problems with the sensor installation, please contact Shlomo.
Explanation of audit data format:
- Linux Sensor:
- Data is collected by default to /var/sensor/upload.
- If the user specifies the data should be left after upload, it will be copied to /var/sensor/olddata
- The file name will end in .gz (run gunzip to convert it to plain text)
- The audit data collected will be in a tab-separated file with the following fields:
- Taxonomy (category we use for each process, see /var/sensor/taxonamy.txt file)
- Process name
- Full path to process
- Current working directory of process
- Command line arguments to process
- Process ID
- Owner/User ID
- System Call type (we are only monitoring open calls (type 5))
- Numeric system call arguments
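The Linux field list above can be read with a short Python sketch like the following. It assumes each line holds exactly the nine fields in the order listed, with the numeric system call arguments kept together as the last column; the class and function names are hypothetical, and the sample line is made up rather than taken from real sensor output.

```python
import gzip
from typing import NamedTuple

class LinuxRecord(NamedTuple):
    taxonomy: str
    process_name: str
    path: str
    cwd: str
    args: str
    pid: int
    uid: str           # kept as text in case the ID has been anonymized
    syscall_type: int  # the sensor monitors open calls (type 5)
    syscall_args: str

def parse_linux_line(line):
    """Split one tab-separated audit line into a LinuxRecord."""
    fields = line.rstrip("\n").split("\t", 8)
    if len(fields) != 9:
        raise ValueError(f"expected 9 fields, got {len(fields)}")
    return LinuxRecord(
        fields[0], fields[1], fields[2], fields[3], fields[4],
        int(fields[5]), fields[6], int(fields[7]), fields[8],
    )

def read_linux_file(path):
    """Yield records from a .gz audit file without unpacking it first."""
    with gzip.open(path, "rt", encoding="utf-8", errors="replace") as handle:
        for line in handle:
            if line.strip():
                yield parse_linux_line(line)

# Made-up sample line, for illustration only.
sample = "\t".join([
    "editor", "vim", "/usr/bin/vim", "/home/userX",
    "vim notes.txt", "4321", "1000", "5", "3 0 0",
])
record = parse_linux_line(sample)
print(record.process_name, record.pid, record.syscall_type)
```

`read_linux_file` could be pointed at a file under /var/sensor/upload or /var/sensor/olddata, provided the files match the assumed layout.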
- Windows Sensor:
- Data is collected by default to the 'data' directory ( c:\Program files\Columbia University\RUU-sensor\ )
- If the user specifies in the config file that the data should be left after upload, it will be copied to the 'doneupload' directory.
- We collect three types of records: registry actions, process execution, and window touches. Registry actions are open/close/update operations on specific registry keys by running programs. There might be thousands of actions per second. We filter background noise by counting frequently occurring touches and reporting them no more than once within a given time window (2 minutes). Window touches include clicking on or switching between windows, and title updates on a window.
- The file name will end in .gz (run gunzip to convert it to plain text)
- The audit data collected will be in the following tab-separated format:
For registry actions:
- Process Executable name
- Type of audit action (open/close/update/etc)
- Result of action (SUCCESS/FAILURE/BUFOVRFLOW)
- Return Value of action
- Windows file structure timestamp (FILETIME: 100-nanosecond intervals since January 1, 1601)
- Hash of username/system
- Plain text date time
- Process ID
For Window Touches:
- Process Executable name
- Type of audit action (UserTouch_0/UserTouch_1) (title vs switch)
- Path of process
- Title of window touched
- Process ID <= Parent PID (for sub windows) some counting flags (how many total touches, how many for this application)
- Windows file structure timestamp
- Hash of username/system
- Plain text date time
- Process ID
- Click Here for sample code to convert the windows timestamp to unix/plain text.
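For readers who prefer Python, the timestamp conversion can be sketched as follows. This is not the linked sample code. It assumes the timestamp field holds a standard Windows FILETIME value written as a decimal integer, that is, a count of 100-nanosecond intervals since January 1, 1601 UTC; the function names are hypothetical.

```python
from datetime import datetime, timedelta, timezone

# FILETIME value for 1970-01-01 00:00:00 UTC (the Unix epoch).
EPOCH_DIFF_TICKS = 116_444_736_000_000_000
TICKS_PER_SECOND = 10_000_000  # one tick is 100 nanoseconds
FILETIME_EPOCH = datetime(1601, 1, 1, tzinfo=timezone.utc)

def filetime_to_unix(filetime):
    """Convert a FILETIME tick count to Unix time in seconds."""
    return (filetime - EPOCH_DIFF_TICKS) / TICKS_PER_SECOND

def filetime_to_datetime(filetime):
    """Convert a FILETIME tick count to a timezone-aware UTC datetime."""
    # datetime resolves microseconds, so the last decimal digit is dropped.
    return FILETIME_EPOCH + timedelta(microseconds=filetime // 10)

unix_seconds = filetime_to_unix(116_444_736_000_000_000)
as_text = filetime_to_datetime(116_444_736_000_000_000).isoformat()
print(unix_seconds)  # 0.0
print(as_text)       # 1970-01-01T00:00:00+00:00
```

Subtracting the epoch offset in integer ticks before dividing avoids losing precision on the large raw values.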
This project is a collaborative effort funded by the I3P organization. The I3P Human Behavior, Insider Threat and Awareness project is a joint effort with 6 other universities and research organizations; funding is provided under contract from the Department of Homeland Security. Further details about the I3P can be found here.