I'm an advocate of testing things - it can be a great way to discover different behaviors and learn how things work (or perhaps why they didn't work the way you thought they might!). When it comes to testing database applications, having some test data is certainly helpful. Ideally, I like to have a fully-loaded production database from which to draw this data into a test environment. Sometimes, however, this is not possible. In cases like these I use the "trick" described below to create my own test data.
When I need to generate test data I frequently call upon a seemingly little-known PL/SQL Package supplied with Oracle Database called "dbms_random". As you may guess this package can be used to generated random data. Rather than explain the package details (they are short and you can read them using the link), I'll just present a quick way to generate some data using SQL*Plus and the dbms_random package.
First, I'll need a table:
create table test_data
As you can see, this table is nothing to write home about, but it does mirror what a real-world table would look like in a lot of cases.
When I am creating test data I generally prefer the data to be reproducible. That is, if I execute the process more than once I like to get the same data each time. To accomplish this with the dbms_random package I call the "seed" procedure and provide an initial seed value:
Let's say I wanted to populate this table with 10,000 rows of data. I use the following to do this:
insert into test_data
to_date('01-JAN-2008', 'DD-MON-YYYY') + dbms_random.value(-100,100)
(select level from dual connect by level <= 10000);
Starting at the bottom of the SQL text is a neat trick using "dual" and "connect by" to essentially create an "unlimited dual" for generating lots of rows. This trick was (I believe) originally put forth by Mikito Harakiri and I discovered it via an Ask Tom thread.
The remainder of the text is fairly straight-forward in its use of the dbms_random package. I use the "string" function to generate a random, lowercase value (which is subsequently passed to "initcap" to capitalize the first character of each string), the "value" function is used to create a random numeric value (which is passed to "round" to make it look like a purchase amount), and then I use a fixed-date to which I add (possibly a negative value) to create a set of valid dates within a range.
The first 10 rows of this data when selected from the table look like:
ID FIRST_NAME LAST_NAME AMOUNT PURCHASE_DATE
--- ---------------- ------------------------ ---------- --------------------
1 Oqq Mxsazbwyx 521.33 22-MAR-2008 16:49:40
2 Jjgqrywtxbdn Fwwbzshhkbqzb 921.47 04-OCT-2007 09:10:00
3 Zxflhufls Mstwydowbaogeyyjiles 172.34 20-MAR-2008 10:22:05
4 Zjjxtyysitsog Zxrzqeflxgo 882.16 26-DEC-2007 18:56:44
5 Kjmuvbrqx Hfu 742.61 16-OCT-2007 14:35:27
6 Oywaibiyp Angvlehlmeujfdlhdmtt 664.5 29-FEB-2008 12:50:40
7 Uhwyvla Nhbwcv 168.99 27-DEC-2007 22:29:59
8 Kpdiqafanbvzt Phjeqwelyugrmahybocwbhvp 813.81 01-MAR-2008 09:15:59
9 Tvezuvrgnzzqkpq Pjyygoqx 880.09 21-NOV-2007 00:42:07
10 Olchylbeft Nflaxjqfkmkgt 847.71 07-DEC-2007 16:53:23
10 rows selected.
Can this technique always be used? No, probably not. For example, the names are not exactly what you might call "proper" names. However, I do find that this technique can be useful when I need to create some data to perform some testing with. Perhaps it will be helpful to you too if you experiment with it and find the right combination of values to use in your tests.