<!doctype html>
<html lang="en" class="no-js">
<head>
<meta charset="utf-8">
<!-- begin SEO -->
<title>BABD: A Bitcoin Address Behavior Dataset for Pattern Analysis - Yuchen Lei</title>
<meta property="og:locale" content="en-US">
<meta property="og:site_name" content="Yuchen Lei">
<meta property="og:title" content="BABD: A Bitcoin Address Behavior Dataset for Pattern Analysis">
<link rel="canonical" href="https://github.com/pages/academicpages/academicpages.github.io/publication/01-2023-babd">
<meta property="og:url" content="https://github.com/pages/academicpages/academicpages.github.io/publication/01-2023-babd">
<meta property="og:description" content="Cryptocurrencies have seen dramatically increased adoption in mainstream applications in fields such as financial and online services. However, a small number of cryptocurrency transactions still involve illicit or criminal activities. It is essential to identify and monitor addresses associated with illegal behaviors to ensure the security and stability of the cryptocurrency ecosystem. In this paper, we propose a framework to build a dataset comprising Bitcoin transactions between 12 July 2019 and 26 May 2021. This dataset (hereafter referred to as BABD-13) contains 13 types of Bitcoin addresses, 5 categories of indicators with 148 features, and 544,462 labeled samples, which is, to our knowledge, the largest labeled Bitcoin address behavior dataset publicly available. We also propose a novel and efficient subgraph generation algorithm called BTC-SubGen to extract a k-hop subgraph, starting from a specific Bitcoin address node, from the entire Bitcoin transaction graph modeled as a directed heterogeneous multigraph. We then conduct 13-class classification tasks on BABD-13 with five machine learning models, namely the k-nearest neighbors algorithm, decision tree, random forest, multilayer perceptron, and XGBoost; the results show accuracy rates between 93.24% and 97.13%. In addition, we study the relations and importance of the proposed features and analyze how they affect the performance of the machine learning models. Finally, we conduct a preliminary analysis of the behavior patterns of different types of Bitcoin addresses using concrete features and find several meaningful and explainable modes.">
<meta property="og:type" content="article">
<meta property="article:published_time" content="2023-12-28T00:00:00-08:00">
<script type="application/ld+json">
{
"@context" : "http://schema.org",
"@type" : "Person",
"name" : "Yuchen Lei",
"url" : "https://github.com/pages/academicpages/academicpages.github.io",
"sameAs" : null
}
</script>
<!-- end SEO -->
<link href="/feed.xml" type="application/atom+xml" rel="alternate" title="Yuchen Lei Feed">
<!-- http://t.co/dKP3o1e -->
<meta name="HandheldFriendly" content="True">
<meta name="MobileOptimized" content="320">
<meta name="viewport" content="width=device-width, initial-scale=1.0">
<script>
document.documentElement.className = document.documentElement.className.replace(/\bno-js\b/g, '') + ' js ';
</script>
<!-- For all browsers -->
<link rel="stylesheet" href="/assets/css/main.css">
<meta http-equiv="cleartype" content="on">
<!-- start custom head snippets -->
<link rel="apple-touch-icon" sizes="57x57" href="/images/apple-touch-icon-57x57.png?v=M44lzPylqQ">
<link rel="apple-touch-icon" sizes="60x60" href="/images/apple-touch-icon-60x60.png?v=M44lzPylqQ">
<link rel="apple-touch-icon" sizes="72x72" href="/images/apple-touch-icon-72x72.png?v=M44lzPylqQ">
<link rel="apple-touch-icon" sizes="76x76" href="/images/apple-touch-icon-76x76.png?v=M44lzPylqQ">
<link rel="apple-touch-icon" sizes="114x114" href="/images/apple-touch-icon-114x114.png?v=M44lzPylqQ">
<link rel="apple-touch-icon" sizes="120x120" href="/images/apple-touch-icon-120x120.png?v=M44lzPylqQ">
<link rel="apple-touch-icon" sizes="144x144" href="/images/apple-touch-icon-144x144.png?v=M44lzPylqQ">
<link rel="apple-touch-icon" sizes="152x152" href="/images/apple-touch-icon-152x152.png?v=M44lzPylqQ">
<link rel="apple-touch-icon" sizes="180x180" href="/images/apple-touch-icon-180x180.png?v=M44lzPylqQ">
<link rel="icon" type="image/png" href="/images/favicon-32x32.png?v=M44lzPylqQ" sizes="32x32">
<link rel="icon" type="image/png" href="/images/android-chrome-192x192.png?v=M44lzPylqQ" sizes="192x192">
<link rel="icon" type="image/png" href="/images/favicon-96x96.png?v=M44lzPylqQ" sizes="96x96">
<link rel="icon" type="image/png" href="/images/favicon-16x16.png?v=M44lzPylqQ" sizes="16x16">
<link rel="manifest" href="/images/manifest.json?v=M44lzPylqQ">
<link rel="mask-icon" href="/images/safari-pinned-tab.svg?v=M44lzPylqQ" color="#000000">
<link rel="shortcut icon" href="/images/favicon.ico?v=M44lzPylqQ">
<meta name="msapplication-TileColor" content="#000000">
<meta name="msapplication-TileImage" content="/images/mstile-144x144.png?v=M44lzPylqQ">
<meta name="msapplication-config" content="/images/browserconfig.xml?v=M44lzPylqQ">
<meta name="theme-color" content="#ffffff">
<link rel="stylesheet" href="/assets/css/academicons.css"/>
<script type="text/x-mathjax-config"> MathJax.Hub.Config({ TeX: { equationNumbers: { autoNumber: "all" } } }); </script>
<script type="text/x-mathjax-config">
MathJax.Hub.Config({
tex2jax: {
inlineMath: [ ['$','$'], ["\\(","\\)"] ],
processEscapes: true
}
});
</script>
<script src='https://cdnjs.cloudflare.com/ajax/libs/mathjax/2.7.4/latest.js?config=TeX-MML-AM_CHTML' async></script>
<!-- end custom head snippets -->
</head>
<body>
<!--[if lt IE 9]>
<div class="notice--danger align-center" style="margin: 0;">You are using an <strong>outdated</strong> browser. Please <a href="http://browsehappy.com/">upgrade your browser</a> to improve your experience.</div>
<![endif]-->
<div class="masthead">
<div class="masthead__inner-wrap">
<div class="masthead__menu">
<nav id="site-nav" class="greedy-nav">
<button><div class="navicon"></div></button>
<ul class="visible-links">
<li class="masthead__menu-item masthead__menu-item--lg"><a href="/">Yuchen Lei</a></li>
<li class="masthead__menu-item"><a href="/publications/">Publications</a></li>
<li class="masthead__menu-item"><a href="/teaching/">Teaching</a></li>
</ul>
<ul class="hidden-links hidden"></ul>
</nav>
</div>
</div>
</div>
<div id="main" role="main">
<div class="sidebar sticky">
<div itemscope itemtype="http://schema.org/Person">
<div class="author__avatar">
<img src="/images/profile.png" class="author__avatar" alt="Yuchen Lei">
</div>
<div class="author__content">
<h3 class="author__name">Yuchen Lei</h3>
<p class="author__bio">MSc @ Wuhan University</p>
</div>
<div class="author__urls-wrapper">
<button class="btn btn--inverse">Follow</button>
<ul class="author__urls social-icons">
<li><i class="fa fa-fw fa-map-marker" aria-hidden="true"></i> Wuhan, Hubei, China</li>
<li><a href="https://www.researchgate.net/profile/Yuchen-Lei-8"><i class="fab fa-fw fa-researchgate" aria-hidden="true"></i> ResearchGate</a></li>
<li><a href="https://www.linkedin.com/in/~yclei/"><i class="fab fa-fw fa-linkedin" aria-hidden="true"></i> LinkedIn</a></li>
<li><a href="https://github.com/TooYoungTooSimp"><i class="fab fa-fw fa-github" aria-hidden="true"></i> Github</a></li>
<li><a href="https://scholar.google.com/citations?user=sCVs-IUAAAAJ"><i class="fas fa-fw fa-graduation-cap"></i> Google Scholar</a></li>
<li><a href="https://orcid.org/0009-0005-4610-6550"><i class="ai ai-orcid-square ai-fw"></i> ORCID</a></li>
</ul>
</div>
</div>
</div>
<article class="page" itemscope itemtype="http://schema.org/CreativeWork">
<meta itemprop="headline" content="BABD: A Bitcoin Address Behavior Dataset for Pattern Analysis">
<meta itemprop="description" content="Cryptocurrencies have seen dramatically increased adoption in mainstream applications in fields such as financial and online services. However, a small number of cryptocurrency transactions still involve illicit or criminal activities. It is essential to identify and monitor addresses associated with illegal behaviors to ensure the security and stability of the cryptocurrency ecosystem. In this paper, we propose a framework to build a dataset comprising Bitcoin transactions between 12 July 2019 and 26 May 2021. This dataset (hereafter referred to as BABD-13) contains 13 types of Bitcoin addresses, 5 categories of indicators with 148 features, and 544,462 labeled samples, which is, to our knowledge, the largest labeled Bitcoin address behavior dataset publicly available. We also propose a novel and efficient subgraph generation algorithm called BTC-SubGen to extract a k-hop subgraph, starting from a specific Bitcoin address node, from the entire Bitcoin transaction graph modeled as a directed heterogeneous multigraph. We then conduct 13-class classification tasks on BABD-13 with five machine learning models, namely the k-nearest neighbors algorithm, decision tree, random forest, multilayer perceptron, and XGBoost; the results show accuracy rates between 93.24% and 97.13%. In addition, we study the relations and importance of the proposed features and analyze how they affect the performance of the machine learning models. Finally, we conduct a preliminary analysis of the behavior patterns of different types of Bitcoin addresses using concrete features and find several meaningful and explainable modes.">
<meta itemprop="datePublished" content="December 28, 2023">
<div class="page__inner-wrap">
<header>
<h1 class="page__title" itemprop="headline">BABD: A Bitcoin Address Behavior Dataset for Pattern Analysis
</h1>
<p>Published in <i>IEEE Transactions on Information Forensics and Security</i>, 2023 </p>
<p>Recommended citation: "BABD: A Bitcoin Address Behavior Dataset for Pattern Analysis," in IEEE Transactions on Information Forensics and Security, vol. 19, pp. 2171-2185, 2024, doi: 10.1109/TIFS.2023.3347894. <a href="https://ieeexplore.ieee.org/document/10375557/"><u>https://ieeexplore.ieee.org/document/10375557/</u></a></p>
</header>
<section class="page__content" itemprop="text">
<p>Cryptocurrencies have seen dramatically increased adoption in mainstream applications in fields such as financial and online services. However, a small number of cryptocurrency transactions still involve illicit or criminal activities. It is essential to identify and monitor addresses associated with illegal behaviors to ensure the security and stability of the cryptocurrency ecosystem. In this paper, we propose a framework to build a dataset comprising Bitcoin transactions between 12 July 2019 and 26 May 2021. This dataset (hereafter referred to as BABD-13) contains 13 types of Bitcoin addresses, 5 categories of indicators with 148 features, and 544,462 labeled samples, which is, to our knowledge, the largest labeled Bitcoin address behavior dataset publicly available. We also propose a novel and efficient subgraph generation algorithm called BTC-SubGen to extract a k-hop subgraph, starting from a specific Bitcoin address node, from the entire Bitcoin transaction graph modeled as a directed heterogeneous multigraph. We then conduct 13-class classification tasks on BABD-13 with five machine learning models, namely the k-nearest neighbors algorithm, decision tree, random forest, multilayer perceptron, and XGBoost; the results show accuracy rates between 93.24% and 97.13%. In addition, we study the relations and importance of the proposed features and analyze how they affect the performance of the machine learning models. Finally, we conduct a preliminary analysis of the behavior patterns of different types of Bitcoin addresses using concrete features and find several meaningful and explainable modes.</p>
</section>
<footer class="page__meta">
</footer>
<section class="page__share">
<h4 class="page__share-title">Share on</h4>
<a href="https://twitter.com/intent/tweet?text=/publication/01-2023-babd" class="btn btn--twitter" title="Share on Twitter"><i class="fab fa-twitter" aria-hidden="true"></i><span> Twitter</span></a>
<a href="https://www.facebook.com/sharer/sharer.php?u=/publication/01-2023-babd" class="btn btn--facebook" title="Share on Facebook"><i class="fab fa-facebook" aria-hidden="true"></i><span> Facebook</span></a>
<a href="https://www.linkedin.com/shareArticle?mini=true&url=/publication/01-2023-babd" class="btn btn--linkedin" title="Share on LinkedIn"><i class="fab fa-linkedin" aria-hidden="true"></i><span> LinkedIn</span></a>
</section>
<nav class="pagination">
<a href="#" class="pagination--pager disabled">Previous</a>
<a href="/publication/02-2024-matd3" class="pagination--pager" title="Multi-Agent Reinforcement Learning for Cooperative Task Offloading in Internet-of-Vehicles">Next</a>
</nav>
</div>
</article>
</div>
<script src="/assets/js/main.min.js"></script>
</body>
</html>