Publications
-
You Can’t Judge a Binary by Its Header: Data-Code Separation for Non-Standard ARM Binaries using Pseudo Labels
Hadjer Benkraouda, Nirav Diwan, Gang Wang
46th IEEE Symposium on Security and Privacy, 2025
TLDR: We propose a novel method to separate data and code in non-standard ARM binaries using pseudo labels. We trained an LLM on 20 million+ instructions.
Paper -
Clear Preferences Leave Traces: Reference Model-Guided Sampling for Preference Learning
Nirav Diwan, Tolga Ergen, Dongsub Shim, Honglak Lee
1st Workshop on Preparing Good Data for Generative AI, AAAI, 2025
TLDR: A fine-tuned model can identify high-quality preference pairs for alignment, even when it may not know which response is better. This substantially increases alignment quality for Coding, Maths and Reasoning on MT-Bench.
Paper -
It Doesn't Look Like Anything to Me: Using Diffusion Model to Subvert Visual Phishing Detectors
Qingying Hao, Nirav Diwan, Ying Yuan, Giovanni Apruzzese, Mauro Conti, Gang Wang
33rd USENIX Security Symposium, 2024 (USENIX Security)
TLDR: Used Diffusion Models to attack online phishing detectors. The attack was empirically validated for 100+ brands across both white-box and black-box settings.
Paper | Code -
Weakening the Inner strength - Spotting core collusive users in the YouTube Blackmarket
Hridoy Sankar Dutta*, Nirav Diwan*, Tanmoy Chakraborty
16th International AAAI Conference on Web and Social Media, 2022 (ICWSM)
TLDR: Investigated the collusive blackmarket on YouTube using graphs to identify the most influential blackmarket users.
Paper | Code -
Fingerprinting Finetuned Language Models in the Wild
Nirav Diwan, Tanmoy Chakraborty, Zubair Shafiq
59th Annual Meeting of the Association for Computational Linguistics (Findings), 2021 (ACL)
TLDR: Developed an LLM-based classifier that attributes AI-generated text to the fine-tuned language model that produced it, across 100+ classes.
Paper | Code -
RecipeDB: A resource for exploring recipes
Devansh Batra, Nirav Diwan, Utkarsh Upadhyay, Jushaan Singh Kalra, Tript Sharma, Aman Kumar Sharma, Dheeraj Khanna, Jaspreet Singh Marwah, Srilakshmi Kalathil, Navjot Singh, Rudraksh Tuwani, Ganesh Bagler
Database: The Journal of Biological Databases and Curation (Oxford University Press), 2020
TLDR: Developed a worldwide database of 100,000+ recipes from 50+ countries.
Paper -
A Named Entity Based Approach to Model Recipes
Nirav Diwan, Devansh Batra, Ganesh Bagler
3rd International Workshop on Data Engineering meets Intelligent Food & Cooking Recipes, 2020 (DECOR Workshop @ ICDE)
TLDR: Created an Information Retrieval (IR) Model to extract ingredient information from recipes.
Paper | Code -
Nutritional Profile Estimation in Cooking Recipes
Jushaan Kalra, Devansh Batra, Nirav Diwan, Ganesh Bagler
3rd International Workshop on Data Engineering meets Intelligent Food & Cooking Recipes, 2020 (DECOR Workshop @ ICDE)
TLDR: Developed a scalable method to estimate nutritional profiles of recipes using a reliable database.
Paper | Code