Information Management Part II: The Database

Though I’ve been looking at the story of my great uncle Jack and the crew of B for Baker since about 1996, my work has really become serious over the last five or so years. In that time I’ve amassed and created a great deal of information: facts, documents, books, photographs, sound recordings, emails, letters and yes, even blog posts. Keeping track of all of that data is a significant task. In my last post I explained how I catalogue my sources. How to easily access the information contained within those sources forms the second part of my information management system.

The files from the first research I completed on my great uncle, which was for a competition called the National History Challenge when I was still in primary school, all fitted into a single A4 display folder. The second phase of work I did, in the year I took off between high school and university, filled a small portable filing box. I graduated to a two-drawer filing cabinet for paper notes and documents when I decided to start my current (and somewhat obsessive) work about five years ago, but by then digital storage was becoming predominant. While it’s still important to have access to the paper files in that cabinet (and I’ve filled another box with them too), most of my work (including this blog post as I write it) is now saved onto a portable hard drive attached to my computer and catalogued as I explained in my last post. Having stored and catalogued all of this information, the next step is to be able to search the whole lot easily to find the parts I’m looking for at any particular time. That is where a solid digital database comes in handy.

The one I eventually settled on is a self-contained piece of software called Personal Knowbase. It’s a simple, easy-to-use program that allows me to save my information in text-based articles and attach any number of relevant keywords to each article. Each article has a date stamp which can be set to any date – like the date of the letter or document under study, for example. I can then easily pull up any articles tagged with a particular keyword, or combination of keywords. Examples of tags I’ve used include general themes (‘training’, ‘flying’, ‘operations’, ‘England’ etc), individuals’ names, book titles, aircraft types, targets, airfields and so on. I’ve also used my catalogue numbers as keywords, which makes it easy to locate the source of any specific quote. I can run a more customised search using a combination of keywords, basic text searches and date ranges, across the entire database or a selected subset of it. Search results can then be exported in various formats for printing or review.

The point is that it is very easy to access information when it is needed, and to tell where that particular piece of information came from. And if I’ve used my keywords effectively, I can also pull up related articles – very helpful when looking for everything I have on a particular raid that happened on 10 May 1944, for example…
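For readers who think in code, the idea of keyword-tagged, date-stamped articles can be sketched in a few lines of Python. This is only an illustration of the concept, not how Personal Knowbase works internally: the `Article` class, the `search` function and the two sample entries are all hypothetical, and the day of the month on the first entry is a placeholder.

```python
from dataclasses import dataclass, field
from datetime import date


@dataclass
class Article:
    """One text note with a date stamp and a set of keywords."""
    title: str
    stamp: date  # date of the letter or document, not the date it was typed up
    text: str
    keywords: set = field(default_factory=set)


def search(articles, keywords=(), text="", start=None, end=None):
    """Return articles that carry ALL the keywords, contain the text
    (case-insensitive) and fall inside the optional date range."""
    wanted = {k.lower() for k in keywords}
    hits = []
    for art in articles:
        tags = {k.lower() for k in art.keywords}
        if not wanted <= tags:
            continue
        if text.lower() not in art.text.lower():
            continue
        if start is not None and art.stamp < start:
            continue
        if end is not None and art.stamp > end:
            continue
        hits.append(art)
    return sorted(hits, key=lambda a: a.stamp)


# Illustrative entries only, loosely based on sources quoted on this blog.
articles = [
    Article(
        title="Gil Pate letter to his mother",
        stamp=date(1943, 11, 1),  # placeholder day; only the month is known here
        text="I guess I'll need a lot of luck to do so again",
        keywords={"Gil Pate", "letters", "A01-409-001"},
    ),
    Article(
        title="Phil Smith logbook, Oxford flight to Coningsby",
        stamp=date(1944, 5, 6),
        text="Base - Coningsby and return. 0.30hrs Day.",
        keywords={"Phil Smith", "flying", "Oxford"},
    ),
]

# Everything tagged 'flying' from May 1944 onwards.
for hit in search(articles, keywords=["flying"], start=date(1944, 5, 1)):
    print(hit.stamp, hit.title)
```

Using a catalogue number as just another keyword, as in the first sample entry, is what lets a quote be traced back to its source with a single search.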

It’s not a perfect system. Simple things like typographical errors can make word searches difficult, and care needs to be taken to use appropriate keywords to avoid burying an important fact under a pile of other stuff. And like anything computer-related, the file is susceptible to hardware failure or file corruption – the latter of which happened to me late last year. Happily, a solid backup regime provides a certain degree of redundancy and I was able to recover my file without losing too much work (the cause was eventually traced to a dodgy portable hard drive). My database is now automatically backed up in two separate locations, one of which is ‘off-site’, and I make an occasional manual copy too ‘just in case’.

Overall, it’s a useful bit of gear. I have the database window open on one side of my screen whenever I’m working on my research. Together with the catalogue spreadsheet, the database makes it easy to store, search and find virtually anything in my collection of sources.

© 2013 Adam Purcell


Information Management Part I: The Catalogue

I have something in the order of 13,000 individual pages in the sources listed in my catalogue for this project. Dealing with the sheer volume of stuff that I’ve gathered remains one of the big challenges of the work. Making interesting discoveries won’t do much good if I then forget where the discoveries came from, how I made them and where they fit into the story when it eventually comes time to write my planned book. There are two keys to my information management system: the catalogue, for knowing which source information comes from, and the database, for knowing what information is in those sources. This post will look at how I catalogue my sources.

If you’ve been reading this blog closely over the last few years, you may have noticed the occasional strange group of letters and numbers popping up in the posts themselves. The codes are in fact references to my catalogue of sources, and are the way I keep track of where my information comes from. The codes look like this, for example, from my ‘Accidents’ post of September 2010:

The second engine faltered shortly after crossing into England so they sought out an emergency aerodrome and, in Phil’s memorable understatement… (B03-001-016)

“…we crash landed rather unsuccessfully…”

Or these two, in ‘Motivations’ (November 2012):

Bill Brill was ‘getting a little accustomed to being scared’ (C07-036-159). And there is no doubt that airmen knew very well exactly how low their chances of surviving a tour were. Gil Pate wrote to his mother in November 1943 (A01-409-001): “It seems an age since I last saw you all + I guess I’ll need a lot of luck to do so again, the way things happen.”

The catalogue was one of the first things I set up when I got seriously stuck into this work in about mid 2008 and though it’s not an incredibly sophisticated system it is quite effective in keeping track of all the sources I’ve gathered over the last few years. It lives on a (well backed-up) multiple-tabbed Excel spreadsheet that I continually add to whenever I obtain a new source. Broadly, the code is split into four groups:

Designator-Series-Item Number-Page Number

The Designator tells me what type of source I’m looking at. It is the first letter in the group, and translates as follows:

A: Original (ie Primary) Documents, Scans or Copies – a document that originated during the war or the immediate period thereafter

B: Transcripts of original documents (used when I have not seen the original – ie someone else has transcribed it)

C: Post-war (ie Secondary) material

The types are broadly defined and can sometimes be a little ambiguous – at this stage it is not critical to define each type precisely. A general idea is sufficient.

The Series indicator defines the broad category under which the source fits. It decodes like this:

01: Letters and Telegrams (including letters I’ve personally sent and received)

02: Flying Logbooks

03: Diaries and Operational Record Books

04: Official Documents and Service Records – typically these documents come from archival collections such as the National Archives of Australia

05: Photographs

06: Articles, Newspaper Clippings and Media Reports – including magazines

07: Books, Memoirs and Video

08: Databases

09: [Currently spare]

The Item Number is simply an increasing three-digit number for each individual document in each series (disregarding the designator), allocated in order of cataloguing. I haven’t yet reached more than 999 items in any category, but if I do I’ll simply transition to a four-digit number for subsequent items.

The three-digit Page Number is the page of the document on which the actual quote or information can be found. Obviously for single-page documents or photos this would remain 001. If it’s a really long book with more than 999 pages, there is nothing stopping me using a four-digit page number.

So putting it all together, using as the first example the reference from my ‘Accidents’ post quoted above:

B03-001-016

This refers to a transcript (B) of a diary (03), which was the first diary I catalogued (001). The quote can be found on page 16 (016).

I then take those details over to my Excel spreadsheet, where I find the matching catalogue entry.

Which tells me that it’s a quote from Phil Smith’s diary. I originally got the document from Mollie Smith, and my copy of it resides in my filing cabinet in the folder ‘Smith, Phil’.

The quote from Bill Brill, in my second example above, has the catalogue number C07-036-159. So it comes from page 159 of the 36th book I catalogued, and the book was written post-war. Referring to my handy spreadsheet, I can see its source is Hank Nelson’s book Chased by the Sun, and that there’s a copy of it on my shelf should I feel the need to check the quote.
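The decoding steps described above are mechanical enough to be written down as a short Python sketch. To be clear, the real catalogue lives in a spreadsheet; the names `Reference`, `parse_reference` and `describe` are hypothetical, and the regular expression is only one reading of the Designator-Series-Item Number-Page Number layout (including the allowance for four-digit item and page numbers).

```python
import re
from typing import NamedTuple

DESIGNATORS = {
    "A": "Original (primary) document, scan or copy",
    "B": "Transcript of an original document",
    "C": "Post-war (secondary) material",
}

SERIES = {
    1: "Letters and Telegrams",
    2: "Flying Logbooks",
    3: "Diaries and Operational Record Books",
    4: "Official Documents and Service Records",
    5: "Photographs",
    6: "Articles, Newspaper Clippings and Media Reports",
    7: "Books, Memoirs and Video",
    8: "Databases",
}

# Designator letter, two-digit series, then item and page numbers of
# three digits (four are allowed, following the overflow rule above).
CODE_PATTERN = re.compile(r"^([ABC])(\d{2})-(\d{3,4})-(\d{3,4})$")


class Reference(NamedTuple):
    designator: str
    series: int
    item: int
    page: int

    def __str__(self):
        # Zero-pad back to the written form, e.g. B03-001-016.
        return f"{self.designator}{self.series:02d}-{self.item:03d}-{self.page:03d}"


def parse_reference(code):
    """Split a catalogue code into its four groups, or raise ValueError."""
    match = CODE_PATTERN.match(code.strip())
    if match is None:
        raise ValueError(f"not a catalogue reference: {code!r}")
    letter, series, item, page = match.groups()
    if int(series) not in SERIES:
        raise ValueError(f"unknown series in {code!r}")
    return Reference(letter, int(series), int(item), int(page))


def describe(ref):
    """Spell out what a parsed reference means."""
    return (f"{DESIGNATORS[ref.designator]}; {SERIES[ref.series]}; "
            f"item {ref.item}, page {ref.page}")


# The two codes as they appear in the quoted blog excerpts.
for code in ("B03-001-016", "C07-036-159"):
    print(code, "->", describe(parse_reference(code)))
```

A check like this would also catch simple transposition slips when typing a code, at least where the result is no longer a valid series or the digits no longer fit the pattern.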

I’ve used this cataloguing system throughout this blog mainly for my own benefit, so that when I start writing the book I keep telling myself I want to write, I can easily trace where all of the information in these posts came from. Obviously the final book will be properly referenced rather than being interrupted by my own strange code groups of letters and numbers – but behind the scenes, when I’m doing the research and the actual writing of the story, it’s a quick and easy shorthand for keeping track of exactly where my information comes from and ensuring that I can check my sources for accuracy where required.

I have also incorporated the referencing numbers into the second part of my information management system – the database. Conveniently, that will be the subject of my next post.


Who did what, when and in which aircraft?

At the Canberra Bomber Command weekend a few years ago, Don Southwell made an off-hand comment about how he wanted to re-do the Squadron histories. I’m beginning to see why!

I’m currently having a fairly close look at the activities of 463 and 467 Squadrons for the time that the crew of B for Baker were at Waddington. I’ve pulled a variety of sources that I’m going through and cross-referencing to try and build a picture of what happened for each day in the period – and, not surprisingly, I’ve found a number of inconsistencies. Take the latest one, for example. Here is an entry from Nobby Blundell’s squadron history, They Flew From Waddington!, written in 1975 and privately published, concerning 29 January 1944:

Berlin again. 467 Sqn F/Lt Simpson’s a/c was attacked by an ME110 – F/Sgt Campbell, the rear gunner, shot it down. We lost 6 a/c from Waddington, 3 from each Squadron, our worse [sic] night to date, 467: ED772, DV378 and ME575. 463: HK537, JA973 and ED949. 43 aircrew in one op. lost.

On the face of it, this seems straightforward. From a single operation to Berlin in late January 1944, Waddington lost six crews. It is true that this was right in the thick of the period that later became known as the Battle of Berlin, and as such there were many raids to that city around that time. The only problem is, this particular Berlin trip appears in none of the other records I’ve looked at. The 463 Squadron Operational Record Book for example, says this:

A dull day. No Ops. Routine work.

And 467 Squadron said this:

Much sleeping today, and a stand down in the afternoon. The usual Saturday night dance was held.

No sign of any operations there, then. Indeed, the Night Raid Report for this date shows that only small forces of Mosquitos were operating on this night.

But I always like to think cock-up before I think conspiracy. It’s unlikely that Blundell would have made the entry up entirely. Far more likely, I think, is that he’s mixed up a few raids and put them into one entry. So I thought I’d have a look around that date and see what went on elsewhere. From the Bomber Command Night Raid Reports and Operational Record Books for 463 and 467 Squadrons, here are the main operations for a few days:

  • 27 January: 530 heavies to BERLIN; 32 lost. 32 aircraft from Waddington; 463 lost one and 467 lost two.
  • 28 January: 683 heavies to BERLIN; 43 lost. 26 from Waddington, one lost from each Squadron.
  • 29 January: No Main Force operations. Squadrons stood down.
  • 30 January: 540 heavies to BERLIN. 33 lost. Waddington sent 24 aircraft. 463 Squadron lost four and 467 Squadron lost one.
  • 31 January: No Night Raid Report, so no ops.

The most likely suspect, looking at this run of operations, is the trip on the 30th. But Blundell claims six losses on that night, not the five in the ORBs. I needed to look deeper.

The only other details that Blundell recorded are the serial numbers of the aircraft lost:

From 467 Squadron:

  • ED772
  • DV378
  • ME575

From 463 Squadron:

  • HK537
  • JA973
  • ED949

So I thought that was a good place to go next. From the Operational Record Books, we get this:

Pilot        Squadron   Aircraft
Messenger    463        ED772
Hanson       463        JA973
Dunn         463        ED949
Fairclough   463        ED545
Riley        467        DV372

Comparing that to Blundell’s list, we see he has accounted for the first three, so I’m now fairly confident that he’s got the wrong date and the operation he is referring to is indeed that of the 30th. But he attributes ED772 to the other squadron, includes DV378, ME575 and HK537 in his list and completely misses ED545 and DV372. To work this one out, it’s time to find another source.
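The comparison in the paragraph above amounts to a couple of set operations, which can be sketched in Python. The serials and squadron attributions are taken straight from the two lists in this post; the variable names are just for illustration.

```python
# Serial -> squadron, as attributed by each source for the night in question.
blundell = {
    "ED772": 467, "DV378": 467, "ME575": 467,
    "HK537": 463, "JA973": 463, "ED949": 463,
}
orb = {
    "ED772": 463, "JA973": 463, "ED949": 463,
    "ED545": 463, "DV372": 467,
}

# Dictionary key views behave like sets, so they can be compared directly.
in_both = sorted(blundell.keys() & orb.keys())
only_blundell = sorted(blundell.keys() - orb.keys())
only_orb = sorted(orb.keys() - blundell.keys())

# Serials both sources list, but under different squadrons.
squadron_disputes = {
    serial: (blundell[serial], orb[serial])
    for serial in in_both
    if blundell[serial] != orb[serial]
}

print("In both lists:   ", in_both)
print("Blundell only:   ", only_blundell)
print("ORB only:        ", only_orb)
print("Squadron differs:", squadron_disputes)
```

The output only says where the two sources disagree. Deciding which one is right still needs a third source.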

Bruce Robertson wrote a book called Lancaster: The Story of a Famous Bomber, published in 1964. In the back section are lists upon lists of Lancaster serial numbers, and what happened to each aircraft. So I checked the serial numbers from the ORB, and from Blundell’s extract. Robertson agrees that the first three were lost on 30 January and shows them as 463 Squadron aircraft – which agrees with the ORBs. ED545, says Robertson, was lost on 14 May 1943 – more than eight months before the night in question – so it must be an error in the ORB. DV372 survived the war and was scrapped in October 1945, so that one must also be incorrect. With those two ORB serials now ruled out we have two aircraft still to identify (those flown by Fairclough and Riley) and three unexplained serials (DV378, ME575 and HK537).

Robertson comes to the rescue again: ME575 was lost on January 27 (one of the other Berlin trips at the end of that month), and indeed the 467 Squadron ORB agrees that this aircraft went missing on that night.

DV378 is very close to DV372, so it is possible that the Orderly Room clerk who typed up the ORB made an error. And indeed, Robertson shows that DV378 went missing on 31 January. Since aircraft returned from the 30 January operation close to and in many cases beyond midnight, and there is no Night Raid Report for the 31st, it is reasonable to suggest that this is the correct serial number for the aircraft flown by P/O Riley.
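Spotting a likely typing slip like this one can be sketched as a search for serials that differ in exactly one character. This is a rough heuristic with a hypothetical helper name, and a near match is only a hint: it is the independent record in Robertson that makes the case.

```python
def differing_positions(a, b):
    """Count the character positions at which two serials differ.
    Returns None if the serials are not the same length."""
    if len(a) != len(b):
        return None
    return sum(x != y for x, y in zip(a, b))


# Serials left unexplained once the three agreed aircraft are set aside.
unmatched_orb = ["ED545", "DV372"]
unmatched_blundell = ["DV378", "ME575", "HK537"]

# Pairs that are a single character apart, i.e. plausible one-key typos.
near_misses = [
    (orb_serial, other)
    for orb_serial in unmatched_orb
    for other in unmatched_blundell
    if differing_positions(orb_serial, other) == 1
]

print(near_misses)  # [('DV372', 'DV378')]
```

Note that ED545 has no such near neighbour in Blundell’s list, so a one-character check like this does nothing to explain that entry.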

That, then, leaves HK537 which, again, Robertson records as being lost on 31 January. That is fairly solid evidence that it was indeed the aircraft flown by P/O Fairclough.

So based on this, the list from above should, I believe, actually look like this:

Pilot        Squadron   Aircraft
Messenger    463        ED772
Hanson       463        JA973
Dunn         463        ED949
Fairclough   463        HK537
Riley        467        DV378

All of this goes to show how important it is to cross-check your sources. The ORBs, while considered the definitive record of what happened on each squadron, vary significantly in quality, depending on the individual officer who wrote them at the time. They were compiled at a time when aircraft were being lost and new aircraft and crews were arriving on squadron virtually every day and as such errors could and did creep in. It takes a bit of patience to painstakingly sort through the records and check other relevant sources to try and find out what actually happened.

I think Don Southwell is on to something when he says he’d like to re-do the squadron histories. It would be a very long job to go through the entire Operational Record Books for both Squadrons to try and find these sorts of errors, but I think it would be a worthwhile exercise if it meant that the histories could be more accurate. What form the histories would then take needs more thought and is, perhaps, a subject for a future post.


Use the Source, Luke

My research catalogue for this project includes about a thousand individual items. And those are just the ones that I have catalogued; there are many more that sit in a great pile on my bookshelf waiting to be looked at. They are from a wide variety of sources and types. There are personal letters, logbooks and photographs. There are service records, casualty files and night raid reports. There are audio recordings, interview transcripts and videos. And there are books – there are many, many books; some written by people who were there, and some written by people who were not there.

No one source can tell the whole story, though – in one sense, this is why there are so many individual items in my catalogue! To build a more complete picture of ‘what really happened and why’ (which, after all, is one of the reasons for doing this work in the first place), multiple sources need to be consulted and compared as a whole.

A pilot’s logbook, for example, can offer a full record of what flights the pilot made and when they went on them. The more fastidious pilots also recorded who they flew with, in which aircraft, and even over which route they flew, which are all Really Useful Facts for a historian. But what a logbook doesn’t necessarily reveal is why each flight was made. Take, for example, this one, which appears in S/Ldr Phil Smith’s wartime logbook on 06MAY44:

Aircraft: Oxford. Pilot: Self. Crew: -. Duty: Base – Coningsby and return. 0.30hrs Day.

This is the first flight in an Oxford that I can find in Phil’s logbook at all (though he did significant flying in the very similar Avro Anson during his training), and it is quite an odd flight to find in the logbook of an operational bomber pilot. Indeed, later that night, Phil led his crew on a bombing operation to an ammunition dump at a place called Sable-sur-Sarthe in France. So what on earth could he have been going to Coningsby for? To find the answer, I needed some other sources.

A few years before he died, Phil wrote an unpublished 29-page typescript for the benefit of his grandson, entitled ‘Phil’s Recollections of 1939-45 War’. I’m lucky enough to have a copy of it and I had cause recently to go through it to see if I could match his (mostly undated) reminiscences with actual flights in his logbook. And, funnily enough, that odd little flight the fifteen or so miles from Waddington to Coningsby is one of those he wrote about.

“For this raid I was appointed ‘Controller’ which meant that I would maintain contact between the target marking Mosquitoes and the main force of Lancasters carrying the bombs. In the afternoon before the raid, the station commander ordered me to visit the target marking people on the nearby aerodrome, Conningsby [sic]. I duly went over there in our Oxford aircraft, a type I had not flown for more than a year.”

But why would Phil need to do that? At the time of the Sable-sur-Sarthe operation, Bomber Command was becoming increasingly engaged in operations against French targets in the lead-up to D-Day. That much is clear from a perusal of Night Raid Reports for this period, in the UK National Archives (AIR14/3411). This trip was no exception. Great care was taken to be accurate on these trips – for the sake of effectiveness of the attack itself, but also to avoid French civilian casualties – and new, far more accurate marking techniques had begun to be developed. This is touched on in a 1951 book called No. 5 Bomber Group RAF by WJ Lawrence (p.164). Indeed, a week previously the crew of B for Baker were on an operation to attack a munitions factory at St Medard-en-Jalles, near Bordeaux, but were ordered to return with their bombs when smoke and haze made accurate visual marking of the target impossible. (The bombers returned the next night and blew the munitions factory out of existence.) Phil Smith, having been appointed Controller for the upcoming raid, went to Coningsby to discuss tactics with the people who would be marking the target for the force he was to control.

So that curious little trip in Phil’s logbook now has an explanation. The primary source (logbook) has been complemented by a range of other documents, both primary (night raid reports) and secondary (Phil’s typescript and the 5 Group book) to come up with a picture of what happened and why.

It’s only a minor detail in the overall scheme of things, but it adds a little bit of colour to an otherwise dry logbook entry. And it gives the history just that little bit more life.
