Berkeley DB Java版性能测试
初衷
最近有很多朋友问到关于BDB等相关的一些性能测试数据,我想性能测试的结果受很多因素的影响,比如:你的程序设计,多线程/并发,测试数据集,测试平台等等。
一个简单的性能测试程序
在我之前的blog: 在Berkeley DB Java版中实现SQL查询,我提到了一下自己写的一个单线程的例子程序在9秒内读取了100万条记录,22秒内插入100万的记录。
我想,我可以在此和大家分享一下我的程序。当然,程序是我花了半天的时间开发的,仅供参考,不代表官方申明。
数据库记录格式 – DPLEntity.java
import com.sleepycat.persist.model.Entity;
import com.sleepycat.persist.model.KeyField;
import com.sleepycat.persist.model.Persistent;
import com.sleepycat.persist.model.PrimaryKey;
/**
* 仅供参考,不作为官方申明!
*
* @author chao
*
*/
public class DPLEntity {
}
@Persistent
class CompositeKey {
@KeyField(1)
int f0 = 0;
@KeyField(2)
String f1 = "The quick brown fox jumps over the lazy dog.";
CompositeKey() { } // for bindings
CompositeKey(int f0) {
this.f0 = f0;
}
CompositeKey(int f0, String f1) {
this.f0 = f0;
this.f1 = f1;
}
@Override
public String toString() {
return "CompositeKey: (" + f0 + "," + f1 + ")";
}
}
@Entity
class BasicEntity {
@PrimaryKey
CompositeKey key;
protected long id = 0;
protected String one = "one";
protected double two = .2d;
protected String three = "three";
Address address = new Address();
BasicEntity() { }
BasicEntity(int i) {
this.key = new CompositeKey(i);
}
public void modify() {
id++;
one += "1";
two = id;
three += "3";
address = new Address("Shenzhen", "Guangdong, China", 500001);
}
@Override
public String toString() {
return "BasicEntity: (" + key + "," + id + "," + one + "," +
two + "," + three + ", " + address + ")";
}
}
@Persistent
class Address {
private String city = "Boston";
private String state = "Massachusetts";
private int zip = 10001;
Address() { }
Address(String city, String state, int zip) {
this.city = city;
this.state = state;
this.zip = zip;
}
@Override
public String toString() {
return "Address: (" + city + "," + state + "," + zip + ")";
}
}
多线程 – MyThread.java
import com.sleepycat.je.CursorConfig;
import com.sleepycat.je.DatabaseException;
import com.sleepycat.je.LockMode;
import com.sleepycat.je.Transaction;
import com.sleepycat.persist.EntityCursor;
import com.sleepycat.persist.EntityStore;
import com.sleepycat.persist.PrimaryIndex;
/**
* 仅供参考,不作为官方申明!
*
* @author chao
*
*/
public class MyThread extends Thread {
/* Global settings */
protected static final int EXIT_SUCCESS = 0;
protected static final int EXIT_FAILURE = 1;
/* Thread id */
protected int id;
/* Start time */
private long start;
/* End time */
private long end;
/* Num ops required */
private int numRequiredOps;
/* Num txns completed */
private int numTxns;
/* Num operations executed */
private int numExecutedOps;
/* Num deadlocks seen, results in retries */
private int numDeadlocks;
/* Entitystore */
private EntityStore store;
/* Primary index */
private PrimaryIndex primaryIndex;
/* Runtime configurations */
private MyExample.TestConfig testConfig;
private int keyNum;
public MyThread(int id,
EntityStore store,
PrimaryIndex primaryIndex,
MyExample.TestConfig testConfig,
int numRequiredOps) {
this.id = id;
this.store = store;
this.primaryIndex = primaryIndex;
this.testConfig = testConfig;
this.numRequiredOps = numRequiredOps;
}
public void run() {
System.out.println("Access method thread: " + id + " started.");
/* Allow the other threads to start. */
yield();
/* Record our start time. */
start = System.currentTimeMillis();
numTxns = 0;
int orderedKey = 0;
try {
numExecutedOps = 0;
boolean done = false;
orderedKey = id;
while (!done) {
Object txn = null;
try {
/* Begin the transaction. */
if (true) {
txn = beginTransaction();
numTxns++;
}
/* Perform the operations. */
boolean execOk = true;
int itemsPerTxn = testConfig.itemsPerTxn;
for (int j = 0; j < itemsPerTxn; j++) {
selectKey(orderedKey);
if (testConfig.operationType.equalsIgnoreCase("READ")) {
/* Read */
databaseGet(txn);
} else if (testConfig.operationType.
equalsIgnoreCase("DELETE")) {
/* Delete */
databaseDelete(txn);
} else if (testConfig.operationType.
equalsIgnoreCase("INSERT")) {
/* Insert */
databasePutNoOverwrite(txn);
} else if (testConfig.operationType.
equalsIgnoreCase("UPDATE")) {
/* Update */
databaseUpdate(txn);
} else {
/* Cursor scan with dirty read. */
scan();
}
if (execOk) {
numExecutedOps++;
orderedKey += testConfig.numThreads;
}
/* See if we're done. */
if (numExecutedOps >= numRequiredOps) {
done = true;
}
if (done) {
break;
}
}
/* Commit the transaction. */
if (txn != null) {
commitTxn(txn);
}
if (done) {
break;
}
} catch (Exception e) {
/* Deal with deadlock and other errors. */
if (txn != null) {
try {
abortTxn(txn);
} catch (Exception e2) {
System.err.println("abort: " + e2);
System.err.println("original exception: " + e);
System.exit(EXIT_FAILURE);
}
}
exitIfNotDeadlock(e);
/* else will retry on next iteration */
numDeadlocks++;
}
}
} catch (Exception e) {
e.printStackTrace();
System.exit(-1);
}
/* Record our end time. */
end = System.currentTimeMillis();
System.out.println("Access method thread " + id +
" exiting cleanly. Executed " + numRequiredOps + " " +
testConfig.operationType + " ops in " +
(end - start) + " ms.");
}
/**
* Select the key. It may be chosen in one of four ways:
* - random value
* - value within a range
* - value tied to the current iteration
* - a single, present value.
*
*/
private void selectKey(int iterationVal) {
int val = iterationVal;
assignKey(val);
}
private void assignKey(int val) {
keyNum = val;
}
private Object beginTransaction()
throws Exception {
try {
Transaction txn =
store.getEnvironment().beginTransaction(null, null);
return txn;
} catch (DatabaseException DE) {
System.out.println("Caught " + DE + " during beginTransaction");
return null;
}
}
private void abortTxn(Object txn)
throws Exception {
Transaction t = (Transaction) txn;
try {
t.abort();
} catch (DatabaseException DE) {
System.out.println("Caught " + DE + " during abort");
}
}
private void commitTxn(Object txn)
throws Exception {
Transaction t = (Transaction) txn;
try {
if (testConfig.syncCommit) {
t.commitSync();
} else {
t.commitNoSync();
}
} catch (DatabaseException DE) {
System.out.println("Caught " + DE + " during abort");
}
}
private void exitIfNotDeadlock(Exception e) {
System.err.println("unexpected exception: " + e);
e.printStackTrace();
System.exit(EXIT_FAILURE);
}
private void databasePutNoOverwrite(Object txn)
throws Exception {
BasicEntity entity = new BasicEntity(keyNum);
primaryIndex.putNoReturn((Transaction) txn, entity);
}
private void databaseGet(Object txn) throws Exception {
BasicEntity e = primaryIndex.get((Transaction) txn,
new CompositeKey(keyNum),
LockMode.DEFAULT);
// System.out.println("Read entity = " + e);
}
private void databaseUpdate(Object txn) throws Exception {
Transaction t = (Transaction) txn;
BasicEntity entity = primaryIndex.get(t, new CompositeKey(keyNum),
LockMode.DEFAULT);
entity.modify();
primaryIndex.putNoReturn((Transaction) txn, entity);
}
private int scan() throws Exception {
int numRecords = 0;
CursorConfig curConf = new CursorConfig();
if (true) {
curConf.setReadUncommitted(true);
System.out.println("setting dirty read");
}
Object txn = null;
if (true) {
txn = beginTransaction();
}
EntityCursor cursor =
primaryIndex.entities((Transaction) txn, curConf);
try {
for (BasicEntity e : cursor) {
numRecords++;
}
} finally {
cursor.close();
}
if (true) {
commitTxn(txn);
}
System.out.println("scan records=" + numRecords);
return numRecords;
}
private void databaseDelete(Object txn)
throws Exception {
primaryIndex.delete((Transaction) txn, new CompositeKey(keyNum));
}
}
我的测试 – MyExample.java
import java.io.File;
import com.sleepycat.je.Database;
import com.sleepycat.je.DatabaseException;
import com.sleepycat.je.Environment;
import com.sleepycat.je.EnvironmentConfig;
import com.sleepycat.je.JEVersion;
import com.sleepycat.persist.*;
/**
* 仅供参考,不作为官方申明!
*
* @author chao
*
*/
public class MyExample {
/* Global settings */
protected static final int EXIT_SUCCESS = 0;
protected static final int EXIT_FAILURE = 1;
/* Variables */
private com.sleepycat.je.Environment env;
private Database db;
private EntityStore store;
private PrimaryIndex primaryIndex;
TestConfig testConfig;
public MyExample(String args[]) {
testConfig = new TestConfig(args);
}
public static void main(String[] args) {
MyExample example = new MyExample(args);
example.execute();
}
/**
* Print the usage.
*/
public static void usage(String msg) {
String usageStr;
if (msg != null) {
System.err.println(msg);
}
usageStr = "Usage: java MyExamplen"
+ " [-h ] [-preload] [-dirtyRead]"
+ " [-useTxns] [-syncCommit]"
+ " [-deferredWrite]n"
+ " [-numThreads ]n"
+ " [-itemsPerTxn ]n"
+ " [-cacheSize ]n"
+ " [-logFileSize ]n"
+ " [-numOperations ]n"
+ " [-operationType ]";
System.err.println(usageStr);
}
/* Set up the test environment. */
public void setup() throws Exception {
/* Create a new, transactional database environment. */
EnvironmentConfig envConfig = new EnvironmentConfig();
envConfig.setAllowCreate(true);
envConfig.setTransactional(testConfig.useTxns);
envConfig.setTxnNoSync(!testConfig.syncCommit);
envConfig.setTxnWriteNoSync(!testConfig.syncCommit);
envConfig.setCacheSize(testConfig.cacheSize);
envConfig.setConfigParam("je.log.fileMax",
String.valueOf(testConfig.logFileSize));
env = new Environment(new File(testConfig.envHome), envConfig);
/* Open the entity store. */
StoreConfig storeConfig = new StoreConfig();
storeConfig.setAllowCreate(true);
storeConfig.setTransactional(testConfig.useTxns);
storeConfig.setDeferredWrite(testConfig.deferredWrite);
store = new EntityStore(env, "thead", storeConfig);
primaryIndex = store.getPrimaryIndex(CompositeKey.class, BasicEntity.class);
/* Preload the database contents into cache. */
if (testConfig.preload)
preload();
}
/*
* If the "-preload" flag is set, do a scan to bring the database into
* memory first.
*/
public void preload() throws Exception {
EntityCursor entities = primaryIndex.entities();
try {
for (BasicEntity e : entities) {
}
} finally {
entities.close();
}
}
/* The execution method. */
public void execute() {
try {
setup();
/*
* Fork off the threads to perform the transactions.
*/
MyThread[] threads = new MyThread[testConfig.numThreads];
for (int i = 0; i < testConfig.numThreads; i++) {
threads[i] = createThread(i, store, primaryIndex, testConfig,
(testConfig.numOperations / testConfig.numThreads));
}
/* Start all threads here. */
for (int i = 0; i < testConfig.numThreads; i++) {
threads[i].start();
}
/* Wait for them to finish. */
for (int i = 0; i < testConfig.numThreads; i++) {
try {
threads[i].join();
} catch (InterruptedException IE) {
System.err
.println("caught unexpected InterruptedException");
System.exit(EXIT_FAILURE);
}
}
/* Close the database and environment. */
close();
} catch (Exception e) {
System.err.println("TestJEDB: " + e);
e.printStackTrace();
System.exit(EXIT_FAILURE);
}
}
public MyThread createThread(int threadId,
EntityStore store,
PrimaryIndex pIndex,
TestConfig testConfig,
int numOps) {
return new MyThread(threadId, store, pIndex, testConfig, numOps);
}
public void close() throws Exception {
try {
if (store != null) {
store.close();
}
if (env != null) {
env.close();
}
} catch (DatabaseException DE) {
System.err.println("Caught " + DE + " while closing env and store");
}
}
/**
* Parses and contains all test properties.
*/
static class TestConfig {
/* Use dirty reads. */
static boolean dirtyRead = false;
/* Preload records into memory by doing a scan before the test. */
static boolean preload = false;
/* Use the deferred write mode to speed up. */
static boolean deferredWrite = false;
/* Use synchronous commit. */
static boolean syncCommit = false;
/* Use transactions. */
static boolean useTxns = false;
/* Number of threads to use. */
static int numThreads = 1;
/* Number of items accessed per txn. */
static int itemsPerTxn = 50;
/* Cache size, default is 100M. */
static long cacheSize = (200 << 20);
/* JE's Log file size, default is 10M. */
static long logFileSize = (10 << 20);
/* Environment home. */
static String envHome = "./tmp";
/* Operations per test phase. */
// static int numOperations = (1 << 20); // default to 1 million.
static int numOperations = 1000000; // default to 1 million.
/* Operation types include: INSERT, READ, SCAN, DELETE and UPDATE */
static String operationType = "READ";
/* Save command-line input arguments. */
static StringBuffer inputArgs = new StringBuffer();
private TestConfig(String args[]) {
if (args.length < 2) {
usage(null);
System.exit(EXIT_FAILURE);
}
try {
/* Parse command-line input arguments. */
for (int i = 0; i < args.length; i++) {
String arg = args[i];
boolean moreArgs = i < args.length - 1;
if (arg.equals("-h") && moreArgs) {
envHome = args[++i];
} else if (arg.equals("-dirtyRead")) {
dirtyRead = true;
} else if (arg.equals("-preload")) {
preload = true;
} else if (arg.equals("-deferredWrite")) {
deferredWrite = true;
} else if (arg.equals("-syncCommit")) {
syncCommit = true;
} else if (arg.equals("-useTxns")) {
useTxns = true;
} else if (arg.equals("-numThreads") && moreArgs) {
numThreads = Integer.parseInt(args[++i]);
} else if (arg.equals("-itemsPerTxn") && moreArgs) {
itemsPerTxn = Integer.parseInt(args[++i]);
} else if (arg.equals("-cacheSize") && moreArgs) {
cacheSize = Long.parseLong(args[++i]);
} else if (arg.equals("-logFileSize") && moreArgs) {
logFileSize = Long.parseLong(args[++i]);
} else if (arg.equals("-numOperations") && moreArgs) {
numOperations = Integer.parseInt(args[++i]);
} else if (arg.equals("-operationType") && moreArgs) {
operationType = args[++i];
} else if (arg.equals("-help")) {
usage(null);
System.exit(EXIT_SUCCESS);
} else {
usage("Unknown arg: " + arg);
System.exit(EXIT_FAILURE);
}
}
/* Save command-line input arguments. */
for (String s : args) {
inputArgs.append(" " + s);
}
inputArgs.append(" je.version=" + JEVersion.CURRENT_VERSION);
System.out.println("nCommand-line input arguments:n "
+ inputArgs);
System.out.println("Test configurations:nt" + this + "n");
} catch (Exception e) {
e.printStackTrace();
System.exit(EXIT_FAILURE);
}
}
@Override
public String toString() {
return "TestConfig={ " +
" dirtyRead=" + dirtyRead +
" preload=" + preload +
" deferredWrite=" + deferredWrite +
" syncCommit=" + syncCommit +
" useTxns=" + useTxns +
" numberThreads=" + numThreads +
" numOperations=" + numOperations +
" operationType=" + operationType + " itemsPerTxn=" + itemsPerTxn +
" logFileSize=" + logFileSize +
" cacheSize=" + cacheSize +
" envHome=" + envHome + " }";
}
}
}
我的测试结果
* $ java -Xmx192M -cp je.jar:. MyExample -h ./tmp -operationType INSERT
Executed 1048576 INSERT ops in 22734 ms.
* $ java -Xmx192M -cp je.jar:. MyExample -h ./tmp -operationType READ # (an out of cache read test)
Executed 1048576 READ ops in 26025 ms.
* $ java -Xmx1024M -cp je.jar:. MyExample -h ./tmp -preload -operationType READ -cacheSize 512000000 # (a in cache read test)
Executed 1048576 READ ops in 9130 ms.
重要备注:
1. 如果在上面的例子程序中,将entity增加多个Secondary Index,并用多个线程进行读写操作,会出现死锁并且是不能避免的。原因如下:
Be aware that if you are using secondary databases (indexes), then locking order is different for reading and writing. For this reason, if you are writing a concurrent application and you are using secondary databases, you should expect deadlocks.
2. 你可以参考JE关于Locks,Deadlocks那个章节的官方文档,从而来针对性的解决或者处理你程序中的死锁异常。链接地址为:http://www.oracle.com/technology/documentation/berkeley-db/je/TransactionGettingStarted/blocking_deadlocks.html。
3. 至于如何来针对性的解决或者处理死锁异常,请参考SUN的工程师(Jeff)的博客。链接地址:http://forums.oracle.com/forums/thread.jspa?threadID=992240&tstart=0。
4. 如果您还有进一步的问题,欢迎反馈到JE的官方论坛:http://forums.oracle.com/forums/forum.jspa?forumID=273&start=0。
刚刚学习BDB,发现生成的.db文件都是初始化为4k或则8k,16k的,这个大小与什么相关呢,是文件系统的簇么?是不是会根据存储内容成倍的累加?不同的文件系统好像是有差异的,请指示一下。
今天,我测试了一下bdb-je,我用了13502个文档,插入,既然花费了40分钟。
@loiy
是什么原因导致你的测试花费了40分钟呢?你的文档太多,还是每个文档的Size太大?你的机器性能太差?JE本身的问题还是你的设置不合理?你的程序怎么写的?40分钟里面,有多少是花在JE,多少花在I/O上的?
刚刚学习BDB,发现生成的.db文件都是初始化为4k或则8k,16k的,这个大小与什么相关呢,是文件系统的簇么?是不是会根据存储内容成倍的累加?不同的文件系统好像是有差异的,请指示一下。
期待回复,这个是最基本的知识吧。
@eric
这个大小是和db的pagesize相关的。db文件的大小是pagesize的整数倍。你可以通过DB->set_pagesize()来设置这个值的大小。
如果你没有显示指定这个pagesize的值,在当前的BDB版本中,windows下pagesize的默认值是8k,linux下pagesize的值为文件系统的block size。
你可以通过我们的文档获得更详细的信息:
http://www.oracle.com/technology/documentation/berkeley-db/db/programmer_reference/general_am_conf.html#am_conf_pagesize
这个我也不太清楚为什么花了那么多分钟。我测试过IO上面的时间花费,是0-300毫秒,基本都不超过100毫秒。而写数据到数据库中时间花费平均都在10-50毫秒之内。我的文档size不大,都是在10K之内。跟你说的几十秒插入100W。没得比。
现在整个程序跑完需要121秒。我的配置是双核,主频是3.0。1G内存。win2003。
@loiy
不错,看来提高cache大小帮你把性能提高很多 (40分钟 -> 121秒)。你不妨按照前面评论提供的优化建议,进一步优化看,然后把结果再贴出来。
另外,我个人觉得如果是双核的CPU搭配1G的内存,是不是低了点?不考虑你CPU的能力,假设你是32位的总线 (现在一般前端总线速率在1000MHz),则你系统总线每秒给CPU传输数据的带宽约 1000M * 32-bit ~= 4G; 而你双核CPU每秒处理数据的量远远大于4G。建议你还是先升级内存到4G吧。
@chaohuang
谢谢,我觉得也是这个问题.但是我不可能把所有的数据都放入cache.但我觉得20万笔要花50秒在IO上(一个database,50mb log file大小,共有2.5G).好像也有问题吧.是不是我的secondary index没设好啊.我用的是BDB-JE.我目前VM memory 设定1G,Cache 350MB.因为我看文档说,cache也不一定越大越好.再次谢谢.
@Henry
1. 你可以运行DbCacheSize这个命令行工具,得到一个cache大小设置的建议。
2. 你可以做一个测试,看看还有多少提高的余地。比如,尝试在你的机器上把2.5G的数据从一个分区copy到另一个分区。如果时间开销是30-40秒,那么算上JE本身的额外开销,我觉得你现在耗时50秒的提升余地不大;如果时间开销只要10-20秒,那么应该有较大提升的余地。你认为呢?
@
@Linchun Sun
谢谢您的回答,我现在已经搭建了一个应用程序,在linux上用BDB,第一次启动创建表速度还很快,对这些表增加一些内容(也就几十条记录),跑两个小时,然后再次重启这个数据库的话启动速度就慢了很多,要四五分钟,不知道是什么原因导致的,也尝试修改cache大小去检测,但是发现还是很慢。若把数据库全部删除再重新启动的话启动速度又快了。好像数据库里面有一些记录就开始启动慢,记录越多,启动越慢。特咨询一下。
@chaohuang
多谢你的答复.我是资料一天存一个目录,每个目录下有2.5G资料.其实我觉得这个量也不算太大(每天的).而且我的table结构简单,感觉好像不应该那么慢.当然也不是每次50秒.平均新查询一个不在cache里的要花10-20秒(subindex 得到cursor的时间).因为查询可能跨不同天(可能365天的).我不可能把所有的资料都preload到cache.这个是正常的BDB JE的速度吗?谢谢.
通过这篇博客,我了解到有越来越多的国内用户对BDB感兴趣,也了解到一些很有意思的BDB应用案例。在此感谢各位的兴趣和回复,也算为开源社区做了点小小贡献吧。这篇文章权当抛砖引玉了,呵呵。
但由于中国这边手头研发任务的关系,另外我觉得评论回复也不是最好和最有效的方式,所以我将关闭这篇文章的评论功能,希望大家理解。
大家如果有关于BDB(包括BDB-JE和BDB-XML)的问题,欢迎发到我们的官方论坛 – 尽量用英文发帖吧(英文实在不行的,我有时间的话,再尝试帮忙翻译)。要求在发帖中:尽量说的明白,最好提供你的环境配置(CPU,I/O, Memory, OS, JDK等)、使用场景、错误报告。当然,如果你能提供一个简短的程序来重现你的问题,将会大大缩短我们分析你问题的时间,从而得到最快的答复。