Great War Forum

London Gazette Black Belts


I wonder if there are any black-belts in the martial art of searching the London Gazette who would be kind enough to share some tips. I recently ran a few threads on Guards Officers and it was quite apparent that my skills in searching the LG are woeful. Some GWF members are clearly fifth dan black-belts at this.

1. Searching the London Gazette Website. I am usually trying to find the commissioning or promotion announcements in the London Gazette. I understand the LG search function relies on an OCR underlay and the OCR is far from perfect. This will of course miss some entries, however most will be captured. Within these known confines I have taken a few approaches. For example, searching for Hugh Francis d'Assisi Stuart Law, Irish Guards, I have done the following:

1. In the Text Search box I enter Law Irish Guards. I limit the dates using the date filters 01/01/1914 - 01/01/1919. Edit: and limit to London Gazette to avoid duplication. It generates 56 returns (Edit: 47 if Edinburgh etc are excluded) and most do not contain info on him.

2. In the Text Search box I enter H F d'A S Law Irish Guards. Dates limited as above. This generates just 2 returns.

If I drop 'Guards' from the text box it generates 14 returns. Not all are relevant

If I drop 'Irish' from the text box it generates 11 returns. Not all are relevant

If I drop 'Irish' and 'Guards' from the text box it generates 129 returns. Not all are relevant

3. In the Text Search box I enter Hugh Francis d'Assisi Stuart Law Irish Guards. Dates limited as above. This generates just 1 return.

None of the approaches captures all the entries for him, except the first, which includes lots of irrelevant entries. Checking all the entries for relevance is extremely time consuming. It appears that initial commissioning is the only time names are written in full; thereafter only initials and surnames are given. I may be wrong on this point but that seems to be the case. Confirmation would be useful.

2. Google. The alternative method is simply to Google "London Gazette H F d'A Law Irish Guards". The challenge with this approach is that it only generates single pages and loses the navigation. In some cases the reason for gazetting is on the previous (hidden) page and it is impossible to navigate one page back.

I am trying to understand if there is a more efficient way to generate the correct returns, one that does not include non-relevant LG announcements and that still allows one to navigate after reaching the relevant page. If anyone can throw some light on this dark art I would be grateful.

To avoid confusion, I have everything on H F d'A S Law, and do not need his LG entries, I simply am using this as the example to illustrate the point. Thanks. MG

Link to post
Share on other sites

Limiting it to the London edition can also cut out the repetitions in the other editions (I expect you're probably all over that one, though).

- brummell

Link to post
Share on other sites

Martin - I have found that including full stops after initials helps.

- brummell

Thanks. I had experimented but it seems in this example it makes no difference. I will persevere. MG

Yes. I forgot to mention that. I will amend the OP accordingly. In the past I have simply ignored the Edinburgh gazette returns. MG

Link to post
Share on other sites

Searching 'H. F. d' A. S. Law' (i.e. with a space between the apostrophe and the A) has brought up four results which do not come up without the space - 2Lt to Lt, appointment to A/Capt, relinquishing A/Capt, Lt to Capt.

Any of those milestones new to you?

- brummell

Link to post
Share on other sites

They are not new to me, but that is because I probably found them while researching other Guardsmen, rather than Law. What is important here is that adding a single space has increased relevant hits - although I get 27 hits not 4, but nevertheless an increase. Thank you. MG

Link to post
Share on other sites
  • Admin

So I take it from this thread that the promised restoration of the Indexes [sic] from the old site that worked before it was passed over to TNA still hasn't happened?

Link to post
Share on other sites

I do the same as Mike, just adding whatever I'm looking for after it.

I sometimes try part names, as this happens:

Hugh Henshall Cliff-

ord Williamson

which is very annoying (more used on ISM names after WW1)

I also look at random pages to see how the name is noted, ie H H C Williamson or Hugh H C Williamson

Link to post
Share on other sites

Certainly no black belt. I find just entering this

15hzdkx.jpg

Gets this Click

Mike

Indeed...but as mentioned in the OP one can not navigate from this point..... although I just noticed by changing the last digit of the url it is possible. MG

Link to post
Share on other sites

Put quotes around names; when you're doing this, including full stops after initials is essential. This will look for an exact match rather than instances where surname and initials all appear somewhere on the page but not in the same name.

So eg "H. F. d'A. S. Law" or "Hugh Francis d'Assisi Stuart Law"

The problem with putting Irish Guards as well is that if you have a list of names (eg commissions or a block of promotions), the Irish Guards bit might appear on the previous page, and so the search would see a page with just the name on, and not Irish Guards.

Occasionally OCR problems mean that breaks between names are missed, or the surname gets "stuck" to the first name on the next line, so you can find that you'll get results for a search like stuartlaw. With nice distinctive forenames like these, or a good set of initials, searching on just the forenames or just the initials can be a good strategy, eg "H. F. d'A. S." or "Hugh Francis d'Assisi".
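The variations suggested in this thread (full stops after initials, a space after the apostrophe, no full stops, initials without the surname) can be generated mechanically so none is forgotten. This is only a minimal sketch: the function name and its inputs are made up for illustration and nothing here talks to the Gazette site itself.

```python
def initials_variants(initials, surname):
    """Return quoted search strings to try for one officer.

    initials: list such as ["H", "F", "d'A", "S"]; surname: e.g. "Law".
    """
    dotted = " ".join(i + "." for i in initials)   # H. F. d'A. S.
    spaced = dotted.replace("'", "' ")             # H. F. d' A. S.
    plain = " ".join(initials)                     # H F d'A S
    candidates = [
        f"{dotted} {surname}",
        f"{spaced} {surname}",
        f"{plain} {surname}",
        dotted,  # initials only, in case the OCR has stuck the surname to another word
    ]
    # Drop repeats but keep the order (names without an apostrophe give identical entries).
    unique = []
    for candidate in candidates:
        if candidate not in unique:
            unique.append(candidate)
    return [f'"{c}"' for c in unique]


for query in initials_variants(["H", "F", "d'A", "S"], "Law"):
    print(query)
```

Each printed line can then be pasted into the Text Search box in turn.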

Google is less fussy about full stops, and also seems less prone to joining lines together, so you can get extra hits that way as you suggest. The best way of searching is to use the "site:" parameter along with the name. Again, putting quotes round things is a good idea, eg site:thegazette.co.uk "H F d'A S Law".
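For anyone who wants to prepare these Google queries in bulk, a small sketch follows. The function names are hypothetical, and the search address used (https://www.google.com/search with a q parameter) is the ordinary public one, assumed here rather than taken from the thread.

```python
from urllib.parse import quote_plus


def gazette_google_query(name, site="thegazette.co.uk"):
    """Build the query text: the site: restriction plus the name in quotes."""
    return f'site:{site} "{name}"'


def google_search_url(query):
    """Turn the query text into a link that opens the search in a browser."""
    # Assumption: the standard Google search address with a q parameter.
    return "https://www.google.com/search?q=" + quote_plus(query)


query = gazette_google_query("H F d'A S Law")
print(query)
print(google_search_url(query))
```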

As you've seen, Google by default just returns a single page of PDF. If you look at the URL you'll see something like https://www.thegazette.co.uk/London/issue/32466/supplement/7505/data.pdf - if you just delete /data.pdf off the end (and hit return) you'll get the standard view with navigation controls. Alternatively, just changing 7505 to 7504 would give you the previous page. A final alternative is to delete supplement/7505/ from the URL (so you get https://www.thegazette.co.uk/London/issue/32466/data.pdf or even just https://www.thegazette.co.uk/London/issue/32466), in which case you will get a PDF of the whole issue.
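The three URL edits just described can be sketched as small helper functions. The function names are invented for illustration, and the pattern assumes URLs shaped exactly like the example above.

```python
import re

# Matches URLs shaped like .../London/issue/32466/supplement/7505/data.pdf
PAGE_URL = re.compile(
    r"^(https://www\.thegazette\.co\.uk/\w+/issue/\d+)/(page|supplement)/(\d+)(/data\.pdf)?$"
)


def navigable_view(url):
    """Drop a trailing /data.pdf to get the standard view with navigation."""
    suffix = "/data.pdf"
    return url[: -len(suffix)] if url.endswith(suffix) else url


def shift_page(url, offset):
    """Return the URL `offset` pages away, e.g. -1 for the previous page."""
    match = PAGE_URL.match(url)
    if not match:
        raise ValueError(f"not a recognised Gazette page URL: {url}")
    base, kind, page, pdf = match.groups()
    return f"{base}/{kind}/{int(page) + offset}{pdf or ''}"


def whole_issue_pdf(url):
    """Remove the page part so the PDF of the whole issue is requested."""
    match = PAGE_URL.match(url)
    if not match:
        raise ValueError(f"not a recognised Gazette page URL: {url}")
    return match.group(1) + "/data.pdf"


hit = "https://www.thegazette.co.uk/London/issue/32466/supplement/7505/data.pdf"
print(navigable_view(hit))
print(shift_page(hit, -1))
print(whole_issue_pdf(hit))
```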

The URL is very structured - London obviously tells you which edition it is, then you get issue followed by the issue number, the next bit is either page or supplement (depending on whether it's a regular issue or a supplement - occasionally you find a supplement incorrectly set), then the page number. This means that if you find a reference that gives an issue number you can just build the URL for it (it's a shame that so often it's the date of the gazette that's given rather than the issue on gallantry medal cards and suchlike).
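Building the URL from a reference might look like the sketch below. The function and parameter names are made up, and it assumes you have the page number as well as the issue number.

```python
def gazette_page_url(issue, page, edition="London", supplement=False, pdf=False):
    """Assemble a page URL from edition, issue number and page number."""
    # The fourth path element is "supplement" for supplements, otherwise "page".
    kind = "supplement" if supplement else "page"
    url = f"https://www.thegazette.co.uk/{edition}/issue/{issue}/{kind}/{page}"
    # Adding /data.pdf asks for the single-page PDF instead of the navigable view.
    return url + "/data.pdf" if pdf else url


print(gazette_page_url(32466, 7505, supplement=True))
```

If a supplement has been set incorrectly, as mentioned above, trying the same call with supplement=False is the obvious fallback.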

Link to post
Share on other sites

David. Thank you. I had experimented with using quotation marks but it did not appear to make a difference. I was probably just unlucky with the small number I tried. The inclusion of periods makes sense, particularly as the underlay is OCR.

In my transcription work I strip out unnecessary periods as they clutter the page, so I have a habit of forgetting about them. Clearly the search function is sensitive to a single space or period and therein lies the challenge.

Given the size and importance of the London Gazette archive material it seems short-sighted to have such an unsophisticated search function. Not being able to filter initial results is a particularly weak part of the process.

If anyone else has any further suggestions they would be gratefully received. Thanks MG.

Link to post
Share on other sites

By all means use the Gazette OCR search engine with various combinations of name etc., but the O.C.R. does not pick up some words due to worn or indistinct print in the papers. After exhausting this method the only way to find a notification is by searching the index. After 1903 the index was available quarterly, before 1903 twice per year. The index gives the page on which the notification is published in a particular quarter. Up to the third quarter of 1915 Corps and Regiments were listed by order of Army precedence, so you have to know or find out where a particular regiment stands in the order. You also have to know if an officer was a Regular or Territorial Force, or search both sections until you find him first. There is also an index to the index quarterly contents.

When you have found the page number of a notification of interest, insert that number in the text search box on the usual page, enter the start and end dates of the quarter in the publication date boxes, e.g. as 01/01/1915 and 31/03/1915 for the first quarter of 1915, and tick the London box in the edition required. Click on the update results button and hopefully the result will come up and you can open the PDF for the page. Sometimes it does not, again because of the OCR not picking up the page number; in this case use a page number close to the one you want and navigate from that page. Also, page numbers can coincide with the likes of service numbers, so you can get several hits.
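The start and end dates for any quarter can be worked out rather than typed from memory. A minimal sketch, assuming the quarters are ordinary calendar quarters as in the 01/01/1915 to 31/03/1915 example; the function name is hypothetical.

```python
import calendar
from datetime import date


def quarter_date_range(year, quarter):
    """Return (start, end) as dd/mm/yyyy strings for the publication date boxes."""
    if quarter not in (1, 2, 3, 4):
        raise ValueError("quarter must be 1, 2, 3 or 4")
    first_month = 3 * (quarter - 1) + 1
    last_month = first_month + 2
    # monthrange gives (weekday of first day, number of days in the month).
    last_day = calendar.monthrange(year, last_month)[1]
    start = date(year, first_month, 1)
    end = date(year, last_month, last_day)
    return start.strftime("%d/%m/%Y"), end.strftime("%d/%m/%Y")


print(quarter_date_range(1915, 1))
```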

This is the link to the index for the first quarter of the London Gazette for 1915 as an example:-

https://www.thegazette.co.uk/london/index/year/1915/volume/1

For other years, say 1916, insert 1916 instead of 1915, and for other quarters change the 1 to 2, 3 or 4.
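The same substitution can be written out for a run of years. This sketch only builds the links and does not download anything; the function name is made up, and treating the volume number as the quarter is taken from the post above, so it may not hold before 1903 when the index appeared twice a year.

```python
def index_url(year, quarter):
    """Link to the London Gazette index for one quarter of one year."""
    return f"https://www.thegazette.co.uk/london/index/year/{year}/volume/{quarter}"


# Every quarterly index for 1914 to 1916.
links = [index_url(year, quarter) for year in range(1914, 1917) for quarter in range(1, 5)]
for link in links:
    print(link)
```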

Download and save the indexes PDFs for further use. The Great War years are quite big files, the index contains up to about 800 pages for some quarters.

Then it's just a matter of practice to familiarise yourself with the search and pick up speed. It is the most reliable way to find anything.

Edit to add: Should have added that the index is also useful for finding honours and awards, and M.I.Ds and much else besides officers' promotions.

Link to post
Share on other sites

Harry

Excellent stuff. Thank you. I didn't know there was an index, but in retrospect it makes sense. Also your tip on the OCR possibly not picking up the exact page number and using a page number close to the target is a very useful tip. MG

Link to post
Share on other sites

I am not an expert on searches on the London Gazette, but I am pretty good on searches. One technique I use is to do a very wide search on the source material, then use the local search (Ctrl F) on the returned material.

What did surprise me was that searching a document in PDF is quicker on an Android tablet than on a W7 i7 laptop with 16 gig of RAM, which was a real shock.

Not tried this on the LG, but use it for work and other research.

Link to post
Share on other sites

Given the ultimate limiting factor in the LG is the quality of the underlying OCR, in theory, regardless of the hardware/software combinations, one should eventually get the same outputs. If the OCR has misread 'Law' as 'Lew', it won't get picked up using Ctrl F (which is binary - I think)
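To illustrate the point about 'Law' and 'Lew': an exact search finds nothing, whereas an approximate comparison can still flag the misread word. This is only a sketch using difflib from the Python standard library, on a made-up line of text; it is not something the LG site offers, and with very short surnames a loose cutoff is likely to throw up many false hits.

```python
import difflib


def near_matches(target, words, cutoff=0.6):
    """Return the words that are close to `target`, best match first."""
    # n must be at least 1, so fall back to 1 when the word list is empty.
    return difflib.get_close_matches(target, words, n=len(words) or 1, cutoff=cutoff)


ocr_line = "Lieut H F Lew Irish Guards"    # invented example of a misread surname
print("Law" in ocr_line)                    # exact search, as Ctrl F would do
print(near_matches("Law", ocr_line.split()))
```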

I have used three different OCR softwares and they sometimes generate radically different outputs. It really depends on the extent to which the software 'tunes into' a particular font. Sometimes one software is better than another. I have no idea which software the LG used but I imagine it was done some time ago. The OCR softwares are improving rapidly.

It is an interesting exercise (when in pdf) to cut and paste the OCR into a word document to see the quality of the OCR work. When transcribing typewritten original work I often simply OCR the pages, cut and paste into word and then do the corrections. If the original is clean and bright and the text is of sufficient size this methodology works. I notice with some pages of the LG the quality of the text is poor and the processing for making the copies has a rather high contrast setting, which sometimes negatively impacts the quality of the OCR outputs.

Given the limiting factor of OCR, it is interesting to hear how others have approached the challenge of searching large bodies of text for individual records.

Here is the OCR underlay for the sample of the LG attached.

Somers-Smith, 2nd Lieut. E., supplementary
(correction) 8616
Lieut 9288
Somerville, 2nd Lieut. R. N . , supplementary
(confirmed) 7333, 7990
Temporary Lieut., supplementary
7333
Spielman, 2nd Lieut. Claude M . ,
supplementary (confirmed) ... 7333
Temporary Lieut., supplementary
7333
Spottiswood, Thomas W. F. (Maj.),
supplementary 7306
Stern, 2nd Lieut. T. H . , supplementary
(confirmed) 7990
Stovell, Frederick (temporary 2nd
Lieut.). 7288
Tart, Cyril James (2nd Lieut., on
probation), supplementary ... 9302
Thomas, John L. Maunsell (2nd
Lieut., on probation) 6941
Thornton, 2nd Lieut. N. H . , supplementary
(correction) 8616
Towneley-Bertie, Capt. The. Hon.
Montague H. E. C. (seconded
with*Royal Naval Air Service) 6688
Tudsbery, 2nd Lieut. M. T., supplementary
(confirmed) ... 7990
Ure, 2nd Lieut. Colin McG., supplementary
(confirmed) 7333
Lieut., supplementary ... 7333
Vachell, 2nd Lieut, (on probation)
E. T., supplementary (correction)
8616
Supplementary (confirmed)... 8633
Viner, Frank (2nd Lieut., on probation)
8633
Wale, 2nd Lieut. E. H . , supplemen-
. tary (confirmed) 7311
Walford, 2nd Lieut, (on probation)
W. G., supplementary (correction)
8616
Confirmed 9283
Lieut 9283
Aside from the formatting, the quality of the output is pretty good.

[Attached image: sample of the LG]

Link to post
Share on other sites

I should explain myself more clearly. For example, I would search for Irish or whatever, or build a search string "Irish"&"Iri"&"Iirs" with a command for unique records, and then look using Ctrl F for Lieut. I might have a play in the next week if I get time.
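One reading of this approach, sketched on made-up page text: run a wide search for each spelling variant, keep each page once, then look within those pages for a second term. Treating the "&" as "any of these variants" is an assumption, and the page numbers and text below are invented for illustration.

```python
def search(pages, term):
    """Return the ids of pages whose OCR text contains `term` exactly."""
    return [page_id for page_id, text in pages.items() if term in text]


def combined_search(pages, variants, follow_up):
    """Wide search on every variant, unique pages only, then a local search."""
    unique = []
    for term in variants:
        for page_id in search(pages, term):
            if page_id not in unique:   # the "unique records" step
                unique.append(page_id)
    # Equivalent of Ctrl F within the returned material.
    return [page_id for page_id in unique if follow_up in pages[page_id]]


# Invented sample data standing in for OCR text keyed by page number.
sample_pages = {
    7505: "Iirs Guards. Lieut. H. F. d'A. S. Law to be Capt.",
    7506: "Irish Guards. Capt. A. B. Example relinquishes his commission.",
    7507: "Scots Guards. Lieut. C. D. Example to be Capt.",
}
print(combined_search(sample_pages, ["Irish", "Iri", "Iirs"], "Lieut"))
```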

Link to post
Share on other sites
  • 2 months later...

Harry, I finally got round to building the Indexes for 1914, 1915 and 1916. This was an excellent piece of advice. Many thanks. MG

Link to post
Share on other sites

Martin,

I am pleased you found the tip useful, but don't stop there. Many Great War Officers served before 1914 and some went on after - keep expanding it. Over time I have downloaded the index from 1880 to 1950. And you will find, if you use a mouse and not a touch pad (as on a laptop), it can take only seconds to search for one item in a particular quarter.

Harry

Link to post
Share on other sites

I have very deep respect for your research expertise... you, along with a dozen other GWF comrages-in-arms (that was a typo but I like it), have expanded my research horizons on a level I can not quite comprehend. It has both shattered and reinforced various research projects I have been incubating.

"I have seen further because I have stood on the shoulders of giants"

I remain in considerable debt to you.

MG

Link to post
Share on other sites

Martin

You are right that the LG search facility is fairly insensitive. It may well be because it was done in the early days of this kind of technology, and hence it suffers from "first time we've tried this" syndrome.

I rarely need to search the LG, and up until a few years ago I had no problem - the UL in Cambridge had bound copies of the LG on open shelves, from 1800 onwards. I once treated myself to half an hour pleasantly reading Wellington's despatch after Waterloo. Now, alas, the LG have vanished from the shelves and they are harder to find.

Ron

Link to post
Share on other sites
