Jacob's TN Searchy Thingy

Make suggestions and report problems.
KillerB
Taylor Quinta de Vargellas 1987
Posts: 2425
Joined: 22:09 Wed 20 Jun 2007
Location: Sky Blue City, England

Post by KillerB »

Conky wrote:For a Computer half-wit like myself, what stage are we up to? Any estimated time-scale, or is it just when a few of you get round to it (Which I completely understand)
Thank you for understanding.
Port is basically a red drink
Roy Hersh
Niepoort LBV
Posts: 283
Joined: 21:55 Mon 31 Dec 2007

Post by Roy Hersh »

Alex,

You are preaching to the choir. I've been begging a friend of mine for 9 months, if not longer. I'd be happy to have my buddy utilize Jacob's talents and hopefully that will happen at some point sooner rather than later.
JacobH
Quinta do Vesuvio 1994
Posts: 3300
Joined: 16:37 Sat 03 May 2008
Location: London, UK

Post by JacobH »

I’m currently in the midst of exams, which has had a somewhat negative effect on my ability to (in rough order of priority): a) consume Port, b) reply to PMs/emails (apologies!) and c) mess around with bits of computer code. However, I’m just about getting there with sorting this out… hopefully it’ll be done by the end of next week (hangovers caused by celebrating the end of the worst course ever devised notwithstanding).

Anyway, I could do with a little assistance with the names of the shippers and was wondering if people here could help me? I have below a list of the ones which I have included so far and I was wondering if I could have any corrections/additions/removals. The particular problem is that so many obscure bottles have been drunk on the internet that I am soon out of my depth in terms of knowledge!

I’m also hoping to include secondary labels and single quintas. The ones I have so far are marked with a +.

The single * denotes what I think is a non-Douro shipper and a ** denotes a BOB company.

Code: Select all

Adriano Ramos-Pinto
Aguias
Allesvorloren*
Antonion Jose Da Silva
Barros
Barão de Vilar
Berry Bros. Own Brand**
Bodega De Leon
Boplaas Cape*
Borges
Broadbent
Buller
Burmester
Butler
Calem
Carvalhas
Casal does Jordões
Champalimaud
Churchhill
+Quinta da Gricha
Cockburn
+Quinta dos Canais
Croft
Cruz
Dalva
Delaforce
Diez Hermanos
Dow
+Quinta do Bomfim
+Senhora da Ribeira
Dutschke*
Feist
Ferreira
Feurheerd** [? does this exist...I have no idea where I got the name!]
Fiin Gammel*
Fonseca
+Guimaraens
Gonzalez Bypass*
Gould Campbell
Graham
+Quinta dos Malvedos
Harvey
Hooper
Hutcheson
James Eadie
KWV
Kopke
Krohn
Landskroon*
Mackenzie
Martinez
Massandra Lavidia [? Another unknown, possibly a typo]
Messias
Moreira [note to self…check Moriera typo]
Morgadio da Calçada [not to self…check Morgado typo]
Morgan Brothers**
Napa*
Niepoort
+Secundum
Offley Forrester
Osborn
Penfold*
Pintas
Porto Pocas 
Porto Rocha
Pousada
Presidential
Quarles Harris
Quinta Infantado
Quinta Nova de Nossa Senhora do Carmo
Quinta Valle Longe
Quinta da Cavadinha
Quinta da Eira Velha
Quinta da Fonte Nova
Quinta da Foz
Quinta da Prelada
Quinta da Romaneira
Quinta da Ventozelo
Quinta de Baldias
Quinta de Brunheda
Quinta de Roriz
Quinta de la Rosa
Quinta do Castelinho
Quinta do Crasto
Quinta do Fojo
Quinta do Javali
Quinta do Mourão
Quinta do Noval
+Nacional
Quinta do Passadouro
Quinta do Portal
Quinta do Rominera
Quinta do Tedo
Quinta do Vale
Quinta do Vesuvio
Quinta do Vista
Qunita do Loureiro
Qunita do Sibio
Rebello
Romariz
Rovalley*
Royal Companhia
Royal Oporto
Rozès
Sandeman
Santa Eufemia
Smith Woodhouse
Symington
Sâo Pedro Dad Aguias
Taylor, Fladgate and Yeatman
+Quinta de Terra Feita
+Quinta de Vargellas
Thorn-Clarke
Vallegre
Warre
+Quinta da Cavadinha
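
For anyone wanting to read a list in this format into a program, here is a minimal sketch in Python (not the language of the actual script). It assumes exactly the conventions described above: a leading + marks a secondary label or single quinta belonging to the shipper above it, a trailing * marks a non-Douro shipper and a trailing ** marks a BOB company. The function name and the dictionary layout are made up for illustration, and lines carrying bracketed notes would need cleaning up first.

```python
def parse_shippers(lines):
    """Turn the marked-up shipper list into a list of dictionaries."""
    shippers = []
    for raw in lines:
        line = raw.strip()
        if not line:
            continue
        if line.startswith("+"):
            # Secondary label / single quinta: attach to the previous shipper.
            if shippers:
                shippers[-1]["labels"].append(line[1:].strip())
            continue
        # Check for "**" before "*", since every "**" line also ends in "*".
        if line.endswith("**"):
            kind, name = "BOB", line[:-2]
        elif line.endswith("*"):
            kind, name = "non-Douro", line[:-1]
        else:
            kind, name = "Douro", line
        shippers.append({"name": name.strip(), "kind": kind, "labels": []})
    return shippers


sample = ["Dow", "+Quinta do Bomfim", "+Senhora da Ribeira", "Dutschke*", "Morgan Brothers**"]
for shipper in parse_shippers(sample):
    print(shipper)
```
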
Finally, I am also looking to produce a list of all possible styles of Port which are likely to ever come up. This, hopefully, should be easier. I should mention that my main aim is to reflect the reality of what has been produced, rather than slavishly follow the IVDP guidelines. So, has anyone come across a style of Port which is not listed below? (Sub-categories are given a “+”.)

Code: Select all

Colheita
Crusted
Garrafeira
Late Bottled Vintage
Ruby
+Reserve
Single Quinta Vintage
Tawny
+10 Year Old
+20 Year Old
+30 Year Old
+40 Year Old
Vintage
White
+Colheita
+Dry  
+Extra Dry
+Lagrima
+Leve Seco
+Medium Sweet
+Sweet
Many thanks!
DRT
Fonseca 1966
Posts: 15779
Joined: 23:51 Wed 20 Jun 2007
Location: Chesterfield, UK

Post by DRT »

Jacob,

Jdaw1 is also in the process of compiling a definitive list of shippers' names and should be able to provide you with that list by email or PM or by posting it here.

On the styles of port I would suggest you add the following:

Vintage Character

Late Bottled Vintage
+Unfiltered
+Bottle Matured
+Traditional

Ruby
+Unfiltered
+Special Reserve

Pink (Yuck!)

White
+Colheita

Colheita
+White

[choose one or the other of the above for your list. I think the second version is more accurate]

Please also note that in the Tawny sub-categories it should be "Over 40 Years Old". The other 3 Tawny sub-categories are ok.

Derek
"The first duty of Port is to be red"
Ernest H. Cockburn
JacobH
Quinta do Vesuvio 1994
Posts: 3300
Joined: 16:37 Sat 03 May 2008
Location: London, UK

Post by JacobH »

Derek T. wrote:Jdaw1 is also in the process of compiling a definitive list of shippers' names and should be able to provide you with that list by email or PM or by posting it here.
Ah, that would be very helpful. Thanks for letting me know!

In terms of styles:

Vintage Character; an interesting suggestion as I think this term is quite often used to describe Rubies. I'll see if I can sort out the filters to get it to work, though. The only TN I can find for a non-ruby version of it was from an off-line last May.

Late Bottled Vintage; these would certainly make sense (as long as people put a note to that effect in their titles!)

Ruby; certainly can add "unfiltered". I wonder whether "special reserve" denotes anything of consequence beyond "reserve" or whether it would balkanise the Ruby results too much?

Pink (Yuck!); might follow the IVDP's lead on this one and put it under "Ruby"! :D

White Colheitas. Hmm...I did wonder about this one. I think the question is: what makes more sense for someone trying to look for White Colheita TNs. Would you look first under "White" or first under "Colheita"?

"Over 40 Years Old"; fair point (though, bizarrely, "40 years old" is correct across the pond!).

Thanks for the suggestions!
DRT
Fonseca 1966
Posts: 15779
Joined: 23:51 Wed 20 Jun 2007
Location: Chesterfield, UK

Post by DRT »

JacobH wrote: Vintage Character; an interesting suggestion as I think this term is quite often used to describe Rubies. I'll see if I can sort out the filters to get it to work, though. The only TN I can find for a non-ruby version of it was from an off-line last May.
That is a TN of a ruby. Vintage Character is a now outlawed description for a style of premium ruby. It has now been replaced by the Reserve classification.

Derek
Alex Bridgeman
Graham’s 1948
Posts: 14880
Joined: 13:41 Mon 25 Jun 2007
Location: Berkshire, UK

Post by Alex Bridgeman »

Jacob,

Do you also need us to provide you with all the variations that we have seen for these names? Although if the search code that you are writing is intended for the moment to only operate on TPF then we have been fairly strict with the naming conventions that we have used for our titles so that any search engine work is made as simple as possible.

Alex
Top Ports in 2023: Taylor 1896 Colheita, b. 2021. A perfect Port.

2024: Niepoort 1900 Colheita, b.1971. A near perfect Port.
JacobH
Quinta do Vesuvio 1994
Posts: 3300
Joined: 16:37 Sat 03 May 2008
Location: London, UK

Post by JacobH »

Alex, variations would be good: I intend half the script to be portable so it can be used for other projects (such as building a cellar database from a plain text list) where more liberal naming conventions will apply.

Also, despite everyone’s best efforts, I already need to include quite a few spelling corrections in my script to deal with common errors (particularly the over-zealous application of the “I before E” rule to Portuguese!)
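
A correction table of that sort could look something like the sketch below. It is in Python purely for illustration (it is not the actual script), and the three entries are just misspellings that have come up in this thread; the table name and function name are invented.

```python
import re

# Hypothetical correction table: lower-case misspelling -> preferred spelling.
CORRECTIONS = {
    "moriera": "Moreira",
    "romaniera": "Romaneira",
    "churchhill": "Churchill",
}

# One pattern matching any known misspelling as a whole word, ignoring case.
_MISSPELLING = re.compile(
    r"\b(" + "|".join(re.escape(word) for word in CORRECTIONS) + r")\b",
    re.IGNORECASE,
)


def correct_spelling(title):
    """Replace each known misspelling in a title with its preferred spelling."""
    return _MISSPELLING.sub(lambda m: CORRECTIONS[m.group(0).lower()], title)


print(correct_spelling("Quinta da Romaniera 2004"))
```

Running the corrections before any name matching keeps the matching step itself simple, at the cost of having to maintain the table by hand.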

-Jacob
jdaw1
Cockburn 1851
Posts: 23613
Joined: 15:03 Thu 21 Jun 2007
Location: London

Post by jdaw1 »

I have code that will parse the names in our TN database.
Alex Bridgeman
Graham’s 1948
Posts: 14880
Joined: 13:41 Mon 25 Jun 2007
Location: Berkshire, UK

Post by Alex Bridgeman »

JacobH wrote:I’m also hoping to include secondary labels and single quintas. The ones I have so far are marked with a +.

The single * denotes what I think is a non-Douro shipper and a ** denotes a BOB company.
Adriano Ramos-Pinto
or Ramos Pinto
Adams**
Aguias
Allesvorloren*
or Allesverloren or Allesveloren
Andresen

Antonion Jose Da Silva
or Antonio Jose Da Silva or AJ da Silva or AJS
Avery**
Barros
Barão de Vilar
Berry Bros. Own Brand**
or Berry Brothers & Rudd or BB&R or BBR
Bodega De Leon
Boplaas Cape*
or Boplaas
Borges
Bredell*
Broadbent
Buller
Burmester
Butler
Calem
Carvalhas
Casal does Jordões
or Casal dos Jordoes
Chammisso

Champalimaud
Churchhill
or Churchill
+Quinta da Gricha
Cockburn
+Quinta dos Canais
Constantino
Croft
+Quinta da Roeda
Cruz
Dalva
De Kraans*
Delaforce
Diez Hermanos
Dolamore
Dow
+Quinta do Bomfim
+Senhora da Ribeira
Dutschke*
Feist
Ferreira
Feurheerd** [? does this exist...I have no idea where I got the name!]
Feurheerd does exist but is spelt Feuerheerd; misspellings of it are common
Fiin Gammel*
Fonseca
+Guimaraens
Gonzalez Bypass*
Gonzalez Byass is a Douro shipper
Gould Campbell
Graham
+Quinta dos Malvedos
Harvey** - were the sole UK agents for Martinez for a while
Hedges & Butler**

Hooper
Hutcheson
James Eadie** - a Birmingham based merchant
KWV* - a South African producer
Kopke
Krohn or Wiese & Krohn
Landskroon*
Mackenzie
Martinez
Massandra Lavidia [? Another unknown, possibly a typo] This is correct, but should be a * port as it comes from the Crimea
Messias
Moreira [note to self…check Moriera typo]
Morgadio da Calçada [not to self…check Morgado typo]
Morgan Brothers** Should not be ** - this is a shipper now owned by Taylors, but is frequently referred to simply as Morgan
Napa*
Niepoort
+Secundum
Offley Forrester or Offley or Offley Boa Vista or Boa Vista
Osborn or Osborne
Overgaauw*

Penfold*
Pintas
Porto Pocas or Pocas
Porto Rocha or Rocha
Pousada
Presidential
Quarles Harris
Quinta Infantado more correctly, Quinta do Infantado
Quinta Nova de Nossa Senhora do Carmo more correctly Quinta da...
Quinta Valle Longe more correctly, Quinta de...
Quinta da Cavadinha
Quinta da Eira Velha
Quinta da Fonte Nova
Quinta da Foz
Quinta da Prelada
Quinta da Romaneira
Quinta da Ventozelo
Quinta de Baldias
Quinta de Brunheda
Quinta de Roriz
Quinta de la Rosa
Quinta do Castelinho
Quinta do Crasto
Quinta do Fojo
Quinta do Javali
Quinta do Mourão
Quinta do Noval or QdN
+Nacional or QdNN or NN
+Silval or Noval Silval

Quinta do Passadouro
Quinta do Portal
+"Portal+" or "Portal +" - "Portal+" is Portal's answer to Sandeman Vau
Quinta do Rominera or Romaniera
Quinta do Tedo
Quinta do Vale
Quinta do Vesuvio
Quinta do Vista
Qunita do Loureiro
Qunita do Sibio
Quinta da Silval (or is it do or de...?)
Rebello
Romariz
Rovalley*
Royal Companhia
Royal Oporto
Rozès
Sandeman
Santa Eufemia
Smith Woodhouse
Symington
Sâo Pedro Dad Aguias
Taylor, Fladgate and Yeatman or Taylor
+Quinta de Terra Feita
+Quinta de Vargellas
++Quinta de Vargellas Vinho Velha
Thorn-Clarke
Vallegre
Warre
+Quinta da Cavadinha
My main suggestion would be to remove the "Quinta da/de/do/dos" from all the names that currently contain them. The only reason for this suggestion is that these are frequently typed wrong, so you would have to be able to identify all the variations on a theme - eg. Quinta da Vesuvio, Quinta de Vesuvio, Quinta do Vesuvio or Quinta dos Vesuvio. It's probably easier just to search out "Vesuvio".
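
To illustrate the idea, here is a rough Python sketch (not taken from anyone's actual code) that throws away a leading "Quinta" plus whichever connecting word was typed, leaving just the distinctive part of the name. The list of connecting words is an assumption and could be extended.

```python
import re

# "Quinta", then optionally one connective. "de la" is listed first so that
# "Quinta de la Rosa" is not cut short after "de".
_QUINTA_PREFIX = re.compile(
    r"^\s*quinta\s+(?:(?:de\s+la|dos|das|da|de|do)\s+)?",
    re.IGNORECASE,
)


def strip_quinta(name):
    """Remove a leading 'Quinta da/de/do/dos ...' from a name."""
    return _QUINTA_PREFIX.sub("", name).strip()


for typed in ["Quinta da Vesuvio", "Quinta de Vesuvio", "Quinta do Vesuvio", "Quinta dos Vesuvio"]:
    print(strip_quinta(typed))
```

All four variants reduce to the same word, which can then be searched for directly.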
JacobH
Quinta do Vesuvio 1994
Posts: 3300
Joined: 16:37 Sat 03 May 2008
Location: London, UK

Post by JacobH »

Thanks for that. That’s extremely helpful.

The “Quinta† issue is not particularly serious as the script looks for the main notable word in the title (e.g. “Noval†, “Quinta Da Noval† and even something like “Chateux Noval† would all produce “Quinta Do Noval† as a result) so any mistakes in Portuguese grammar will be ignored!

The three names which I’m a bit worried about are: Ramos-Pinto, Porto Pocas and Porto Rocha. I think the full name “Adriano Ramos-Pinto” is used so rarely that it can safely be filed under “R”, but I’m less sure about which of Porto Pocas/Pocas and Porto Rocha/Rocha is most common. For the former, I suppose it doesn’t really matter as it will end up in pretty much the same place, but I’m less sure about Porto Rocha.

-Jacob
jdaw1
Cockburn 1851
Posts: 23613
Joined: 15:03 Thu 21 Jun 2007
Location: London

Poças, surely.

Post by jdaw1 »

JacobH wrote:Porto Pocas/Pocas
Poças, surely. (Though my code strips away all diacritical marks before comparison, adding them back for recommended names.)
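
As an illustration of that approach (a Python sketch under my own assumptions, not the code referred to above): decompose each name with Unicode normalisation, drop the combining marks, compare the stripped forms, and report the properly accented recommended name. The helper names and the short list of recommended names are hypothetical.

```python
import unicodedata


def strip_diacritics(text):
    """Return text with accents, cedillas and other combining marks removed."""
    decomposed = unicodedata.normalize("NFD", text)
    return "".join(ch for ch in decomposed if not unicodedata.combining(ch))


# Hypothetical list of recommended spellings, diacritics included.
RECOMMENDED = ["Poças", "Rozès", "Barão de Vilar"]

# Lookup keyed on the stripped, lower-cased form of each recommended name.
_LOOKUP = {strip_diacritics(name).lower(): name for name in RECOMMENDED}


def recommend(name):
    """Map a name typed with or without diacritics to its recommended spelling."""
    return _LOOKUP.get(strip_diacritics(name).lower())


print(recommend("Pocas"))
print(recommend("barao de vilar"))
```
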
JacobH
Quinta do Vesuvio 1994
Posts: 3300
Joined: 16:37 Sat 03 May 2008
Location: London, UK

Re: Poças, surely.

Post by JacobH »

jdaw1 wrote:
JacobH wrote:Porto Pocas/Pocas
Poças, surely. (Though my code strips away all diacritical marks before comparison, adding them back for recommended names.)
Ah, yes, indeed :D

That’s the approach I’ve been taking, though I think there might be some clever PHP function which does it automatically. I might have a look into that!

I think I’m just about getting there, having written from scratch for the second time… The maxim “You have a problem and decide to solve it by XML. You now have two problems.” comes to mind :roll:

-Jacob
JacobH
Quinta do Vesuvio 1994
Posts: 3300
Joined: 16:37 Sat 03 May 2008
Location: London, UK

Post by JacobH »

Finally, I think I’ve made some progress on this script and turned it into something which might be useful.

I’ve uploaded a new version to http://www.jacob-head.com/tpf and would be grateful for any feedback.

The main changes are:
1) Additions to and corrections of the details of a number of producers.
2) Ditto for various Port styles.
3) New search facility (so you can search for all/any of the words in a phrase).
4) Allow the drop-down menus to combine with search box (i.e. you can now search for a term within all Colheitas).
5) Switching to XPath queries rather than loops, which has resulted in it being much faster.
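
For the curious, changes 3) to 5) can be sketched along the following lines. This is a Python illustration under stated assumptions, not the script itself: the XML layout (a note element with producer, style and year attributes) and the sample notes are invented, and Python's ElementTree only supports a subset of XPath.

```python
import xml.etree.ElementTree as ET

# Invented sample index; the real data will be laid out differently.
SAMPLE = """
<notes>
  <note producer="Quinta do Noval" style="Colheita" year="1937">Noval 1937 Colheita</note>
  <note producer="Niepoort" style="Colheita" year="1900">Niepoort 1900 Colheita</note>
  <note producer="Niepoort" style="Vintage" year="1970">Niepoort 1970</note>
</notes>
"""


def search(root, style=None, words=(), match_all=True):
    """Filter notes by an optional style (the drop-down) and by search words."""
    path = ".//note"
    if style is not None:
        # The XPath predicate selects the matching notes in one query,
        # instead of looping over every note and testing its style.
        path += f"[@style='{style}']"
    combine = all if match_all else any
    results = []
    for note in root.findall(path):
        title = (note.text or "").lower()
        if not words or combine(word.lower() in title for word in words):
            results.append(note.text)
    return results


root = ET.fromstring(SAMPLE)
print(search(root, style="Colheita", words=["niepoort"]))
print(search(root, words=["noval", "1970"], match_all=False))
```

The first call is the "search within all Colheitas" case; the second is an "any of the words" search with no style selected.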

Once everyone is happy with how it looks, I can then add the necessary bits of code to make it automatically update when someone posts.

Things that could be added include: original poster’s name; date and time of the post; highlighting of non-Douro/BOB shippers and changes to the formatting. Let me know what you think about these.

-Jacob
JacobH
Quinta do Vesuvio 1994
Posts: 3300
Joined: 16:37 Sat 03 May 2008
Location: London, UK

Post by JacobH »

Drat, I’ve just noticed that “Nacional” without “Noval” is now not being picked up, for some reason. I’ll correct that when I get a chance (I think it was caused by not wishing to get false positives on single-variety Touriga Nacional ports).
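
One possible way to get both behaviours, sketched in Python as an assumption about how it could be done rather than a description of the eventual fix: accept “Nacional” on its own, but not when the word immediately before it is “Touriga”. The function name is made up.

```python
import re

# "nacional" as a whole word, unless directly preceded by "touriga ".
_NACIONAL = re.compile(r"(?<!touriga )\bnacional\b", re.IGNORECASE)


def mentions_noval_nacional(title):
    """True if a title mentions Nacional other than as 'Touriga Nacional'."""
    normalised = " ".join(title.split())  # collapse runs of whitespace
    return _NACIONAL.search(normalised) is not None


for title in ["Nacional 1963", "Quinta do Noval Nacional 1994", "Touriga Nacional 2005"]:
    print(title, "->", mentions_noval_nacional(title))
```
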
Conky
Fonseca 1980
Posts: 1770
Joined: 23:51 Wed 20 Jun 2007

Post by Conky »

Just had a play with it....Excellent.

I would love a score option, but it's very good as it is.

If I've praised it enough? A suggestion. The look of it needs jazzing up. That funny paisley design doesn't do it for me, and it might benefit from more stylish and distinctive font. But of course that is secondary to doing its job, which it does.

Alan.
RonnieRoots
Fonseca 1980
Posts: 1981
Joined: 08:28 Thu 21 Jun 2007
Location: Middle Earth

Post by RonnieRoots »

Great work Jacob!

I would like to see the poster's name and original post date included. That would give a better overview if there is more than one result for a specific port. A couple of remarks on producer names:

Gonzalez Bypass (although funny) must be Gonzalez Byass

Osborne

Poças is listed twice, use either Poças or Porto Poças

Quinta do Infantado

Quinta do Vale D. Maria

Quinta do Vista must be Vista Alegre
(this is not a single quinta wine)

That's all!
JacobH
Quinta do Vesuvio 1994
Posts: 3300
Joined: 16:37 Sat 03 May 2008
Location: London, UK

Post by JacobH »

Conky wrote:Just had a play with it....Excellent.

I would love a score option, but it's very good as it is.

If I've praised it enough? A suggestion. The look of it needs jazzing up. That funny paisley design doesn't do it for me, and it might benefit from more stylish and distinctive font. But of course that is secondary to doing its job, which it does.

Alan.
Thanks! I haven't put any thought into the styling of it, though that is very easy to change. At the moment, it's just using the basic house-style of the website where it is stored, which looks pretty horrific if you are using Internet Explorer rather than Firefox. If we put it on :tpf: we can easily incorporate the current style of the site, or do something different.

Incidentally, the "funny paisley design" is an original tile design by Augustus Pugin from The True Principles of Pointed, or Christian Architecture :P

In terms of the score, that's a bit difficult to do automatically, because the scores aren't in an obvious (for a computer) place. We could look at doing something more complex to try to resolve that (e.g. by manually adding them for existing notes and then having them in a specific place for new ones) but that might have to wait until a version 2.
Last edited by JacobH on 14:51 Tue 17 Jun 2008, edited 2 times in total.
JacobH
Quinta do Vesuvio 1994
Posts: 3300
Joined: 16:37 Sat 03 May 2008
Location: London, UK

Post by JacobH »

RonnieRoots wrote:Great work Jacob!

I would like to see the poster's name and original post date included. That would give a better overview if there is more than one result for a specific port. A couple of remarks on producer names:

Gonzalez Bypass (although funny) must be Gonzalez Byass

Osborne

Poças is listed twice, use either Poças or Porto Poças

Quinta do Infantado

Quinta do Vale D. Maria

Quinta do Vista must be Vista Alegre
(this is not a single quinta wine)

That's all!
Thanks! I'll make those changes when I'm next at my computer.

I think this has nicely demonstrated that I can neither spell nor type :D
Alex Bridgeman
Graham’s 1948
Posts: 14880
Joined: 13:41 Mon 25 Jun 2007
Location: Berkshire, UK

Post by Alex Bridgeman »

I think this is great.

My only request is for another sort button - by vintage.

I do note that where there is a thread (as opposed to a single posting on a tasting) then there can be repeats of the post in the index that comes back.

To illustrate this, try doing a search on "Berry Brothers Selection" and "Vintage Port" as the type and take a look at the two notes that come back for the 1970 vintage.

Thanks!
JacobH
Quinta do Vesuvio 1994
Posts: 3300
Joined: 16:37 Sat 03 May 2008
Location: London, UK

Post by JacobH »

Alex, I didn't think there would be much interest in that functionality, but it should be quite easy to sort that out.

As for the second issue, I don't think there's any easy way around it but I don't think there are too many duplicates in the forum. Perhaps it will be improved once I add dates/original poster?
JacobH
Quinta do Vesuvio 1994
Posts: 3300
Joined: 16:37 Sat 03 May 2008
Location: London, UK

Post by JacobH »

I’ve had a chance to make the changes which have been suggested and re-uploaded it to the same place: http://www.jacob-head.com/tpf.

The “by vintage† function doesn’t work entirely as I would wish, but I think it’s generally ok. The post’s author and the date of the post don’t appear on my website because the data I am using to generate the index doesn’t include it, but it should work once uploaded to :tpf:.

I’ve also added a little JavaScript which, once a producer is selected, blanks out years and types which do not exist for that producer. (This might be a bit bandwidth-heavy, though, so may have to be removed.)
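
The data behind that blanking-out could be prepared along these lines. This is a Python sketch of the idea only (the real thing runs as JavaScript in the browser), and the function names and sample notes are invented for the example.

```python
def build_availability(notes):
    """Map each producer to the sets of years and styles that have notes."""
    available = {}
    for producer, style, year in notes:
        entry = available.setdefault(producer, {"years": set(), "styles": set()})
        entry["years"].add(year)
        entry["styles"].add(style)
    return available


def years_to_blank(available, producer, all_years):
    """Years in the drop-down that have no notes for the chosen producer."""
    known = available.get(producer, {"years": set()})["years"]
    return sorted(set(all_years) - known)


# Invented sample notes as (producer, style, year).
notes = [
    ("Niepoort", "Colheita", 1900),
    ("Niepoort", "Vintage", 1970),
    ("Quinta do Noval", "Vintage", 1963),
]
available = build_availability(notes)
print(years_to_blank(available, "Quinta do Noval", [1900, 1963, 1970]))
```

Sending one such table for every producer up front is what costs the bandwidth; fetching the entry for a single producer only when it is selected would be the obvious alternative.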

Finally, I’ve added the bits of code necessary to make it automatically update, so, perhaps we could look at moving to the next stage (of setting it up on the :tpf: servers) unless there are any more comments or changes? (Which would be more than welcome :))

-Jacob