Computer Vision and Machine Learning Lab (CVML)

About

The Computer Vision and Machine Learning (CVML) Laboratory, housed within the University at Albany's College of Nanotechnology, Science, and Engineering, is a leading research group specializing in computer vision, machine learning, and video analytics. 

Our research is dedicated to advancing the mathematical foundations and algorithmic development of machine learning and deep learning, particularly for the analysis of visual imagery. Through interdisciplinary collaboration, we pursue comprehensive research initiatives that draw on expertise from diverse domains.

Funding has been provided by the National Science Foundation (NSF), the Defense Advanced Research Projects Agency (DARPA), the National Institute of Justice (NIJ), the Intelligence Advanced Research Projects Activity (IARPA), Inventec Corporation, and several UAlbany seed funds. We also appreciate the generous donations from NVIDIA.

 

Join the Team

Are you passionate about artificial intelligence (AI), computer vision, and deep learning?

Our lab is looking for highly motivated, dedicated individuals to dive into cutting-edge research and contribute to groundbreaking advancements in these fields.

Please reach out to our Co-directors, Ming-Ching Chang and Xin Li, for more information on opportunities within our lab.

Projects

Visit the faculty pages of our Co-directors, Ming-Ching Chang and Xin Li, for a list of the lab's latest publications.

 

Rat Seizure Detection & Classification

The precise identification and categorization of seizure stages through observable behaviors are essential for understanding and diagnosing neurological disorders. Traditional studies in neurology rely heavily on manual observation, which is both time-intensive and labor-intensive.

Utilizing rodent models for epilepsy research offers a pathway to discovering more effective drugs or neurotherapy treatments. The preliminary screening of potential anti-epileptic drugs (AEDs) with these models requires extensive video monitoring, a process that is not only slow and subjective but also prone to error.

This research aims to introduce an AI-powered method for the quantitative analysis of rodent behaviors, including the classification of epilepsy stages. By employing advanced deep learning and computer vision technologies, we created an automated system for detecting and recognizing seizures in animals.

A novel hybrid technique, merging model-based and data-driven approaches, was developed to automate the classification of epilepsy stages from video footage.

This investigation validates the practicality of using video analysis for detecting and classifying seizure stages in rodent models of temporal lobe seizures, offering a consistent and quantitative way to study rodent behavior. 

This could potentially aid further behavioral studies in animals, such as those related to depression and anxiety. In the future, we plan to release a public dataset to the research community.
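The sketch below illustrates the dataflow of such a three-stage pipeline: per-frame rat detection, per-frame skeleton estimation, and clip-level seizure-stage recognition (see the infographic description below). The detector, pose estimator, and stage classifier here are simple placeholders standing in for trained models; only the structure of the pipeline follows the description above.

```python
# Minimal sketch of the three-stage video pipeline. All model calls are
# placeholders (any off-the-shelf detector / pose estimator / temporal
# classifier could fill these roles); the dataflow follows the text:
# per-frame detection -> per-frame skeleton -> clip-level stage recognition.
from dataclasses import dataclass
from typing import List, Tuple

import numpy as np

@dataclass
class FrameResult:
    box: Tuple[float, float, float, float]  # rat bounding box (x1, y1, x2, y2)
    keypoints: np.ndarray                   # (K, 2) skeleton joint coordinates

def detect_rat(frame: np.ndarray) -> Tuple[float, float, float, float]:
    """Step 1 (placeholder): return the rat bounding box for one frame."""
    h, w = frame.shape[:2]
    return (0.25 * w, 0.25 * h, 0.75 * w, 0.75 * h)  # dummy centered box

def estimate_skeleton(frame: np.ndarray, box) -> np.ndarray:
    """Step 2 (placeholder): return K skeleton keypoints inside the box."""
    x1, y1, x2, y2 = box
    k = 8                                            # assumed number of joints
    rng = np.random.default_rng(0)
    return np.c_[rng.uniform(x1, x2, k), rng.uniform(y1, y2, k)]

def classify_stage(track: List[FrameResult]) -> int:
    """Step 3 (placeholder): map a clip's keypoint motion to a stage 1-7.
    A real system would use a trained temporal model; this thresholds
    a crude motion-energy statistic purely for illustration."""
    pts = np.stack([f.keypoints for f in track])     # (T, K, 2)
    motion = float(np.abs(np.diff(pts, axis=0)).mean()) if len(pts) > 1 else 0.0
    return int(np.clip(1 + motion // 10, 1, 7))

def run_pipeline(frames: List[np.ndarray]) -> int:
    track = []
    for frame in frames:
        box = detect_rat(frame)
        track.append(FrameResult(box, estimate_skeleton(frame, box)))
    return classify_stage(track)

clip = [np.zeros((540, 960, 3), np.uint8) for _ in range(16)]  # dummy clip
print("Predicted seizure stage:", run_pipeline(clip))
```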

 

Detailed alternative text for this infographic is provided below.

 

Alternative Text for Rat Seizure Detection & Classification Infographic

The infographic has three panes.

The first pane includes the words, "Step 1: Rat Bounding Box Detection for each frame," along with a photo of a rat hooked up to a machine inside a small glass enclosure. A blue square surrounds the rat, with a notation of "Person: 78.9."

The second pane includes the words, "Step 2: Rat Skeleton Detection for each frame," along with the same photo of the rat but with different annotations. A green square surrounds the rat, with blue, green and orange lines and dots along its body.

The third pane includes the words, "Step 3: Rat Seizure Stage Recognition for whole video clip," along with a chart titled, "Modified Racine Scale - Seizure Intensity Stages."

The chart includes this information:

  • Stage 1 (No Seizures): Staring and mouth clonus or wet-dog shakes
  • Stage 2: Head nodding or neck jerks
  • Stage 3: Forelimb clonus
  • Stage 4: Rearing with forelimb clonus
  • Stage 5: Rearing and falling with forelimb clonus
  • Stage 6: Wild running/jumping
  • Stage 7: Wild running/jumping followed by tonic/clonic seizure

Each stage is accompanied by a sketch of the rat experiencing the symptoms described. An arrow points to Stage 3, which is highlighted.

Stages 1 to 3 are designated partial seizures. Stages 4 to 7 are designated generalized seizures, with Stages 4 to 5 noted as forebrain and Stages 6 to 7 noted as brainstem.

Challenging Image Manipulation Detection (CIMD)

The ability to detect manipulation in multimedia data is vital in digital forensics. Existing Image Manipulation Detection (IMD) methods are mainly based on detecting anomalous features arising from image editing or double-compression artifacts.

All existing IMD techniques encounter challenges in detecting small tampered regions within a large image. Moreover, compression-based IMD approaches face difficulties when an image is compressed twice with the same quality factor.

To investigate state-of-the-art (SoTA) IMD methods under these challenging conditions, we introduce a new Challenging Image Manipulation Detection (CIMD) benchmark dataset, which consists of two subsets for evaluating editing-based and compression-based IMD methods, respectively.

The dataset images were manually captured and tampered, and come with high-quality annotations. In addition, we propose a new two-branch network model based on HRNet that can better detect both image-editing and compression artifacts under these challenging conditions.

Extensive experiments on the CIMD benchmark show that our model significantly outperforms SoTA IMD methods on CIMD.
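As a rough illustration of the two-branch idea (not the paper's actual HRNet-based architecture), the sketch below pairs an RGB branch for editing artifacts with a frequency-oriented branch for compression artifacts and fuses them into a per-pixel manipulation mask. The high-pass residual used here is a simplified stand-in for a compression-aware input representation.

```python
# Illustrative two-branch manipulation-detection model: an RGB branch for
# editing artifacts plus a high-frequency branch for compression artifacts,
# fused into a per-pixel tamper-probability mask. Architecture details are
# assumptions for the sketch, not the paper's model.
import torch
import torch.nn as nn

def conv_block(cin, cout):
    return nn.Sequential(
        nn.Conv2d(cin, cout, 3, padding=1), nn.BatchNorm2d(cout), nn.ReLU(inplace=True),
        nn.Conv2d(cout, cout, 3, padding=1), nn.BatchNorm2d(cout), nn.ReLU(inplace=True),
    )

class TwoBranchIMD(nn.Module):
    def __init__(self, feat=32):
        super().__init__()
        self.rgb_branch = conv_block(3, feat)   # editing-artifact features
        self.freq_branch = conv_block(3, feat)  # compression-artifact features
        self.head = nn.Conv2d(2 * feat, 1, 1)   # fuse -> per-pixel logit

    @staticmethod
    def high_pass(x):
        # Crude high-frequency residual (image minus local mean) as a
        # placeholder for a DCT/compression-aware input representation.
        return x - nn.functional.avg_pool2d(x, 5, stride=1, padding=2)

    def forward(self, x):
        f = torch.cat([self.rgb_branch(x), self.freq_branch(self.high_pass(x))], dim=1)
        return torch.sigmoid(self.head(f))      # (B, 1, H, W) tamper-probability mask

model = TwoBranchIMD()
mask = model(torch.rand(1, 3, 256, 256))
print(mask.shape)  # torch.Size([1, 1, 256, 256])
```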

Read our research paper, A New Benchmark and Model for Challenging Image Manipulation Detection.

 

Skeleton-based Human Action Recognition

Human action recognition is an important but challenging problem. With advances in low-cost sensors and real-time joint coordinate estimation algorithms, reliable 3D skeleton-based action recognition is now feasible. 

We are interested in using recurrent neural network (RNN) models to solve the skeleton-based human action recognition problem.
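As a minimal sketch of this approach, the model below runs an LSTM over flattened per-frame 3D joint coordinates and classifies the clip from the final hidden state. The joint count, clip length, and number of action classes are illustrative assumptions, not fixed properties of our system.

```python
# Minimal RNN sketch for skeleton-based action recognition, assuming clips
# of T frames with J 3D joints each (all shapes here are illustrative).
import torch
import torch.nn as nn

class SkeletonLSTM(nn.Module):
    def __init__(self, num_joints=25, num_classes=10, hidden=128):
        super().__init__()
        self.lstm = nn.LSTM(input_size=num_joints * 3,  # (x, y, z) per joint
                            hidden_size=hidden, num_layers=2, batch_first=True)
        self.fc = nn.Linear(hidden, num_classes)

    def forward(self, x):                  # x: (B, T, J, 3) joint coordinates
        b, t = x.shape[:2]
        out, _ = self.lstm(x.reshape(b, t, -1))
        return self.fc(out[:, -1])         # classify from the last time step

model = SkeletonLSTM()
logits = model(torch.rand(4, 30, 25, 3))   # 4 clips, 30 frames, 25 joints
print(logits.shape)                        # torch.Size([4, 10])
```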
 

Skeleton-based Human Action Recognition demonstration, with three images showing a skeletal figure crouching, jogging and throwing over time.


 

Single & Multiple Object Tracking

Object tracking, which aims to extract the trajectories of single or multiple moving objects in a video sequence, is a crucial step in understanding and analyzing video.

A robust and reliable tracking system is the basis for a wide range of practical applications ranging from video surveillance and autonomous driving to sports video analysis. 

We have developed several object tracking algorithms based on the hyper-graph formalism and provided a benchmark dataset for multi-object tracking performance evaluation.
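For intuition, the toy tracker below associates detections frame to frame by greedy intersection-over-union (IoU) matching. Our actual methods use a richer hypergraph formulation; this sketch only illustrates the basic idea of linking boxes over time into identity-labeled trajectories.

```python
# Toy multi-object tracker: greedy frame-to-frame IoU association.
# Illustrative only; not the lab's hypergraph-based method.
from itertools import count

def iou(a, b):
    """Intersection-over-union of boxes (x1, y1, x2, y2)."""
    x1, y1 = max(a[0], b[0]), max(a[1], b[1])
    x2, y2 = min(a[2], b[2]), min(a[3], b[3])
    inter = max(0.0, x2 - x1) * max(0.0, y2 - y1)
    area = lambda r: (r[2] - r[0]) * (r[3] - r[1])
    return inter / (area(a) + area(b) - inter + 1e-9)

def track(frames, thresh=0.3):
    """frames: list of per-frame detection lists; returns per-frame track IDs."""
    new_id, tracks, results = count(), {}, []          # tracks: id -> last box
    for dets in frames:
        assigned = {}
        for det in dets:
            # Greedily match each detection to the best surviving track,
            # or start a new track if no match clears the IoU threshold.
            best = max(tracks, key=lambda t: iou(tracks[t], det), default=None)
            if (best is not None and iou(tracks[best], det) >= thresh
                    and best not in assigned.values()):
                assigned[tuple(det)] = best
            else:
                assigned[tuple(det)] = next(new_id)
        tracks = {tid: list(det) for det, tid in assigned.items()}
        results.append([assigned[tuple(d)] for d in dets])
    return results

frames = [[[0, 0, 10, 10], [50, 50, 60, 60]],   # frame 1: two vehicles
          [[1, 1, 11, 11], [51, 50, 61, 60]]]   # frame 2: both moved slightly
print(track(frames))                            # [[0, 1], [0, 1]]
```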
 

Single and Multiple Object Tracking demonstration, with eight images of cars driving down a motorway, with red or blue boxes around each vehicle and some vehicles identified in green text.


 

UA-DETRAC Benchmark Dataset for Multi-Object Tracking

UA-DETRAC is a challenging real-world multi-object detection and multi-object tracking benchmark. The dataset consists of 10 hours of video captured with a Canon EOS 550D camera at 24 different locations in Beijing and Tianjin, China.

The videos are recorded at 25 frames per second (fps) with a resolution of 960 x 540 pixels. The UA-DETRAC dataset contains more than 140,000 frames and 8,250 manually annotated vehicles, yielding a total of 1.21 million labeled object bounding boxes.

We also perform benchmark tests of state-of-the-art methods in object detection and multi-object tracking, using the evaluation metrics detailed in our work.
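As background on how such trackers are scored, the snippet below computes MOTA, a standard accuracy measure from the CLEAR MOT family; the benchmark's full evaluation protocol is described in our paper, and the numbers here are purely illustrative.

```python
# MOTA from the CLEAR MOT metric family (standard formula, shown for
# background; not the benchmark's exact evaluation protocol):
#
#   MOTA = 1 - (FN + FP + IDSW) / num_ground_truth_objects
#
def mota(false_negatives: int, false_positives: int,
         id_switches: int, num_gt: int) -> float:
    """CLEAR MOT accuracy: 1 is perfect; can go negative for poor trackers."""
    return 1.0 - (false_negatives + false_positives + id_switches) / num_gt

# e.g., 1,000 misses, 500 false alarms, 20 identity switches over 10,000
# ground-truth boxes (illustrative numbers):
print(round(mota(1_000, 500, 20, 10_000), 3))  # 0.848
```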

Team

Lab Co-directors

Ming-Ching Chang
Associate Professor
College of Nanotechnology, Science, and Engineering; Department of Computer Science; Department of Electrical & Computer Engineering
UAB-424A
Xin Li
Professor
College of Nanotechnology, Science, and Engineering; Department of Computer Science
UAB-413

Lab Members


Dr. Chuanbo Hu, Postdoctoral Associate

Rui Wang, PhD candidate, Computer Science

Yuwei Chen, PhD candidate, Computer Science

Abhineet Pandey, PhD student

Zhenfei Zhang, PhD student

An Yu, PhD student

Ting-Yu Tsai, PhD student