Jorge Miguel Calha Rainho Machado
Role: Professor
Number: 20128
Institutional email: jmachado@estgp.pt
Courses taught
-
Engenharia Informática
-
Curso Técnico Superior Profissional - Desenvolvimento para Web e Dispositivos Móveis
teacher in charge
Work Performed and Contributions
others
-
Title |
Architecture and user interface for a geo-temporal search service (Unpublished)
-
|
year |
2009 |
Institution
|
European Conference on Digital Libraries 2009 |
Description / Summary |
This paper describes the architecture and user interface for a digital
library whose resources carry geographic and temporal information. We discuss
the importance of separating these two dimensions from the textual one. We
detail the service components, emphasizing the search engine component and
the support services. We present a case study aimed at rebuilding the DIGMAP
Search Service architecture. DIGMAP is a co-funded European Union project
on old digitized maps. We discuss the overall architecture of the search service,
summarizing the support components: a metadata repository (Repox),
a text parsing tool (Geoparser) and the Gazetteer. We follow a mashup
approach, which in this scope means quickly creating systems from
existing components to provide new functionality over the Web. Our search
service is based on the Mitra system, summarized in this paper, a search
engine platform for indexing spaces with both structured and unstructured
information. We also detail how it was upgraded to use the geographic and
temporal data provided by Geoparser, and we describe the search engine's new
user interface, built to take advantage of the available services and the new
ones provided. |
Electronic file
|
machadoECDL2009.pdf
(292 Kb)
|
-
Title |
User interface for a geo-temporal search service using DIGMAP components
-
|
year |
2009 |
Institution
|
DEMONSTRATION in ECDL 2009 |
Description / Summary |
This demo presents a user interface for a Geo-Temporal search
service built as a follow-up to the DIGMAP project. DIGMAP was a co-funded
European Union project on old digitized maps, and deals with resources rich in
geographic and temporal information. This search interface followed a mashup
approach using existing DIGMAP components: a metadata repository, a text
mining tool, a Gazetteer, and a service to generate geographic contextual
thumbnails. The Google Maps API is used to provide a friendly and interactive user
interface. This demo will present the functionality of the resulting geo-temporal
search engine, whose interface uses Web 2.0 capabilities to provide
contextualization in time and space and text clustering. |
Electronic file
|
ecdl2009VersõesFinais.zip
(9562 Kb)
|
articles
-
Title |
A Micro-analysis of Topic Variation for a Geotemporal Query
-
|
year |
2013 |
Conference / Workshop / Magazine
Institution
|
INESC-ID, National Institute of Electronics and Computer Systems, Lisbon, Portugal |
Description / Summary |
Bias introduced in question wording is a well-known problem in
political attitude survey polling. For example, the question "The
President believes our military mission in Afghanistan is a vital
national interest -- agree/disagree?" is quite different from the
question: "Do you believe that a military mission in Afghanistan
is in the USA’s vital national interest?" Response variation
according to different question wording has been studied by
researchers in survey methodology. However, the influence on
search results of variations in topic wording has not been
examined for geotemporal information retrieval. For the GeoTime
evaluation in NTCIR Workshop 9, the organizers decided to
run an experiment in query variability in order to study
variability of performance. We took a single information need
and expressed it in three different ways: 1) as a single event
question, 2) as a question which would yield an open-ended list
(e.g. the classic “which countries did the Pope visit in the last
three years”), or 3) as a reformulation of the single event question as
a location (latitude/longitude) and time inquiry. This paper
reports the results of this micro-analysis of variation effects upon
a single query expressed in different formats, as well as the
degree to which we achieved (or failed to achieve) our explicit
goal of being able to distinguish
performance outcomes for the different formulations. |
Electronic file
|
02-EVIA2011-GeyF.pdf
(907 Kb)
|
-
Title |
Geo-Temporal retrieval filtering versus answer resolution using Wikipedia
-
|
year |
2011 |
Conference / Workshop / Magazine
Institution
|
INESC-ID, Lisbon |
Description / Summary |
We describe an evaluation experiment on GeoTemporal
Document Retrieval created for the GeoTime evaluation task of
NTCIR 2011. This work describes the retrieval techniques
developed to accomplish this task. We describe the collections
used in the workshop, detailing their composition
in terms of geographic and temporal expressions. The first
contribution of this work is the collections’ statistics, which by
themselves reveal the relevance of this subject. Our parsing techniques
found millions of references related to the relevance dimensions of
time and space. Those references were used to index the
documents in order to score them in those dimensions. We also
introduce a technique to find extra references in Wikipedia using
Google Search Service and the same parsers used on the
collections. Those references were used in four different scenarios
depending on the queries: first, we used the references found in
topics to filter documents without geographic or temporal
expressions, and used pseudo relevance feedback to expand topics
with no references using the indexes created for places and dates;
in another approach, we used the Wikipedia references to filter
documents from the result set; in a last approach, we expanded all
topics with the Wikipedia references. Finally, we used another
technique based on metric distances calculated from
coordinates (latitudes and longitudes) and dates in order to create
a scope for documents and topics, and ranked them according to the
distance between each other. |
Electronic file
|
06-NTCIR9-GEOTIME-MachadoJ-2011.pdf
(1034 Kb)
|
-
Title |
NTCIR9-GeoTime Overview - Evaluating Geographic and Temporal Search: Round 2
-
|
year |
2011 |
Conference / Workshop / Magazine
Institution
|
INESC-ID, National Institute of Electronics and Computer Systems, Lisbon, Portugal |
Description / Summary |
GeoTime for the NTCIR Workshop 9 is the second evaluation of
Geographic and Temporal Information Retrieval called “NTCIR
GeoTime”. The focus of this task is on search with Geographic
and Temporal constraints. This overview describes the data
collections (Japanese and English news stories), topic
development, assessment results and lessons learned from this
second NTCIR GeoTime task, which combines GIR with time-based
search to find specific events in a multilingual collection.
Six teams submitted Japanese runs and nine teams submitted
English runs. Three teams participated in both Japanese and
English. |
Electronic file
|
01-NTCIR9-OV-GEOTIME-GeyF-2011.pdf
(958 Kb)
|
-
Title |
LGTE: Sistema aberto de Recuperação de Informação Textual, Geográfica e Temporal.
-
|
year |
2010 |
Conference / Workshop / Magazine
Institution
|
II JORNADAS SASIG, Évora, 2-4 November 2009 |
Description / Summary |
This paper presents LGTE (Lucene Geo-Temporal Extensions), a textual,
geographic and temporal Information Retrieval (IR) system that extends
the open-source Lucene system, a text indexing engine written in Java.
LGTE is the engine behind the DIGMAP search service. LGTE can index
collections of XML documents and includes a set of utilities for common
search engine features, which can easily be tried out in a demo that
ships with the package. LGTE also includes a component for building IR
evaluation experiments that uses the CLEF/TREC format and provides
utilities such as linguistic and n-gram stemmers, query expansion
models, geographic ranking models [1], textual ranking models such as
Okapi BM25, the language model of the Lucene-LM system, Lucene's Vector
Space Model, and the divergence from randomness models of the Terrier
system. This paper presents the LGTE architecture and a tutorial on
using the tool. |
Electronic file
|
machadoPosterLGTE.pdf
(2240 Kb)
|
-
Title |
GEOTIME: Experiments with Geo-Temporal Expressions Filtering and Query Expansion at Document and Phrase Context Resolution.
-
|
year |
2010 |
Conference / Workshop / Magazine
Institution
|
Proceedings of NTCIR-8 Workshop Meeting, June 15–18, 2010, Tokyo, Japan |
Description / Summary |
We describe an evaluation experiment on GeoTemporal
Document Retrieval created for the GeoTime evaluation task of
NTCIR 2010. GeoTemporal Retrieval aims to improve retrieval
results using Geographic and Temporal dimensions of relevance.
To accomplish that task, systems need to extract geographic and
temporal information from the documents, and then explore
semantic relations among those dimensions within the documents.
Since this is the first time the task is taking place, our aim is to
evaluate some basic techniques in order to set some research
directions for our work. We aim to understand the relevance of
temporal and geographic expressions for filtering purposes. The
geographic expressions were extracted with Yahoo PlaceMaker
and for temporal expressions we used the TIMEXTAG system.
We experimented with techniques using both the overall document and
sentence resolutions, as well as a mixed approach. We also used a
query expansion mechanism in topics with no filters defined. We
used BM25 as the retrieval model and preprocessed the topics
with a semi-automatic methodology to create structures that let us
create our filters and expansions. We learned that the sentence
level is not a very good approach (although we found indications that
the paragraph context resolution could improve the results) and that
filters based on geographic and temporal expressions showed
good performance. |
-
Title |
NTCIR-GeoTime Overview: Evaluating Geographic and Temporal Search
-
|
year |
2010 |
Conference / Workshop / Magazine
Institution
|
Proceedings of NTCIR-8 Workshop Meeting, June 15–18, 2010, Tokyo, Japan |
Description / Summary |
For the NTCIR Workshop 8 we organized a Geographic and
Temporal Information Retrieval Task called “NTCIR GeoTime”.
The focus of this task is on search with Geographic and Temporal
constraints. This overview describes the data collections
(Japanese and English news stories), topic development,
assessment results and lessons learned from the NTCIR GeoTime
task, which combines GIR with time-based search to find specific
events in a multilingual collection. Eight teams submitted
Japanese runs (including three unofficial teams who provided runs
to expand the pools) and six teams submitted English runs. One
team participated in both Japanese and English. |
Electronic file
|
overviewFredGey.pdf
(155 Kb)
|
-
Title |
LGTE: Lucene Extensions for Geo-Temporal Information Retrieval
-
|
year |
2009 |
Conference / Workshop / Magazine
Institution
|
ECIR/WGII, Toulouse, 2009 |
Description / Summary |
This paper presents LGTE, a set of geo-temporal extensions to the
Lucene information retrieval framework initially developed as part of the
DIGMAP project. This paper gives an overview of the functionality available in
LGTE and evaluates the ranking mechanisms proposed for geo-temporal retrieval.
The evaluation focuses only on the geographic and text models and was carried out
against the GeoCLEF corpus with the 2008 English topics. We assigned to each
document a specific geographic region using a geoparser, a text mining tool,
and a gazetteer, to disambiguate locations (both tools also developed in the
DIGMAP project). We compared different approaches for geographic
information retrieval and concluded that the best performance was achieved by
a linear combination of a language model together with a custom function for
estimating geospatial similarity. We provide the details of our linear
parametric model to maximize the results. |
Electronic file
|
machadoECIR.pdf
(299 Kb)
|
-
Title |
User interface for a geo-temporal search service using DIGMAP components
-
|
year |
2009 |
Conference / Workshop / Magazine
Institution
|
ECDL 2009 |
Description / Summary |
This demo presents a user interface for a Geo-Temporal search service built as a follow-up to the DIGMAP project. DIGMAP was a co-funded European Union project on old digitized maps, and deals with resources rich in geographic and temporal information. This search interface followed a mashup approach using existing DIGMAP components: a metadata repository, a text mining tool, a Gazetteer, and a service to generate geographic contextual thumbnails. The Google Maps API is used to provide a friendly and interactive user interface. This demo will present the functionality of the resulting geo-temporal search engine, whose interface uses Web 2.0 capabilities to provide contextualization in time and space and text clustering. |
Electronic file
|
ECDL2009-poster-LGTE.ppt
(4237 Kb)
|
-
Title |
Experiments with N-Gram Prefixes on a Multinomial Language Model versus Lucene’s off-the-shelf ranking scheme and Rocchio Query Expansion (TEL@CLEF Monolingual Task)
-
|
year |
2009 |
Conference / Workshop / Magazine
Institution
|
Cross Language Evaluation Forum |
Description / Summary |
We describe our participation in the TEL@CLEF task of the CLEF
2009 ad-hoc track, where we measured the retrieval performance of LGTE, an
indexing engine for geo-temporal collections which is mostly based on Lucene,
together with extensions for query expansion and multinomial language
modelling. We experimented with an N-Gram stemming model to improve on last
year's experiments, which consisted of combinations of query expansion,
Lucene’s off-the-shelf ranking scheme and the ranking scheme based on
multinomial language modelling. The N-Gram stemming model was based on a
linear combination of N-Grams, with N between 2 and 5, using weight factors
learned from last year's topics and assessments. The Rocchio
ranking function was also adapted to implement this N-Gram model. Results
show that this stemming technique, together with query expansion and
multinomial language modelling, results in increased performance. |
Electronic file
|
machadoTelClef2009Springer.pdf
(87 Kb)
|
-
Title |
Definição de Pontos de Vista Arquitecturais: um caso de estudo
-
|
year |
2009 |
Conference / Workshop / Magazine
Institution
|
9ª Conferência da Associação Portuguesa de Sistemas de Informação, 28-30 October 2009 |
Description / Summary |
Quality management at the Instituto Politécnico de Portalegre (IPP) is a continuous
improvement process involving groups of professionals from all of the institute's
schools. The analysis groups designed and manage a common enterprise architecture
consisting of process models and the required information, with the goal of feeding
the organizational performance indicators defined in the Balanced Scorecard.
However, the enterprise architecture, divided into information architecture,
processes and indicators, is structured in text documents that are insufficiently
detailed and show misalignments. These deficiencies make any automatic extraction
of views impossible. A view is a representation of the organization that captures
and presents the concerns of a stakeholder. In this sense, views facilitate the
analysis and updating of the architecture, which should increase the institution's
performance. This paper presents, first, the problems in the current IPP
architecture; second, the process proposed for reformulating the enterprise
architecture and aligning the specifications with the reality of the IPP; third,
a UML model to represent the reformulated architecture; and fourth, a mechanism
for creating viewpoints, defined together with those stakeholders, from the UML
model and a set of XQuery libraries. |
Electronic file
|
machadoCAPSI2009final.pdf
(561 Kb)
|
-
Title |
Experiments on a Multinomial Language Model versus Lucene’s off-the-shelf ranking scheme and Rocchio Query Expansion (TEL@CLEF Monolingual Task)
-
|
year |
2008 |
Conference / Workshop / Magazine
Institution
|
ECDL/CLEF, Aarhus, in Springer LNCS proceedings, 2008 |
Description / Summary |
We describe our participation in the TEL@CLEF task of the CLEF
2008 ad-hoc track, where we measured the retrieval performance of the IR
service that is currently under development as part of the DIGMAP project.
DIGMAP’s IR service is mostly based on Lucene, together with extensions for
using query expansion and multinomial language modelling. In our runs, we
experimented with combinations of query expansion, Lucene’s off-the-shelf ranking
scheme and the ranking scheme based on multinomial language modelling.
Results show that query expansion and multinomial language modelling both
result in increased performance. |
Electronic file
|
machadoTelClef2008Springer.pdf
(99 Kb)
|
-
Title |
MITRA: Uma Solução para Serviços de Pesquisa em Intranets.
-
|
year |
2007 |
Conference / Workshop / Magazine
Institution
|
XATA 2007, FCUL, Lisbon, 15-16 February 2007. |
Description / Summary |
This paper describes the MITRA system, a solution for indexing online
content and complementary descriptive metadata encoded in any XML schema.
This capability makes the system an ideal solution for specialized search
services on intranets. MITRA is based on a five-layer architecture. The
first layer is content harvesting, which can be implemented by external
systems or by specialized resource transfer systems, such as structured
local archives. The second layer creates inverted indexes of the harvested
content and metadata (using the LUCENE system). The third layer manages the
semantic relations and the associations between metadata and resources. A
fourth, very recent layer allows the implementation of a domain analysis
methodology. The last layer, presentation, receives structured queries as
HTTP requests and responds in XML or HTML, depending on whether XSL
stylesheets are used. The internal representation schema for metadata is
Dublin Core, which lets MITRA naturally provide an SRU/SRW interface,
although other schemas can also be configured. MITRA thus combines the
power of free-text content indexing with the power of structured metadata
processing, offering the best of both worlds. This solution supports
several production services, which are reported in the paper. |
Electronic file
|
10.pdf
(693 Kb)
|
-
Title |
Project Markup Language (PML) Schema Proposal
-
|
year |
2006 |
Conference / Workshop / Magazine
Institution
|
XATA 2006, Portalegre |
Description / Summary |
In this paper we present the steps followed to produce a proposal for a Project Markup Language (PML). PML is intended for use in project management solutions such as GPRM (Global Project for Research Management) [1]. It is a markup language for Project Management Servers (such as Microsoft Project Server/EPM Servers [10], Global Project Management/GPM Servers or GPRM Server [1]). The main purpose of PML is to establish a standard model for project information that can be used across the various Project Management Applications and Servers. With it, search and retrieval index engines (such as SPEAK) can be used to enable free communication between different Project Servers and Applications. This paper focuses on the language features and presentation scheme designed for Project Management. |
Electronic file
|
54.pdf
(345 Kb)
|
-
Title |
DEPTAL a Framework for Institutional Repositories
-
|
year |
2005 |
Conference / Workshop / Magazine
Institution
|
DELOS Workshop, Heraklion, Greece, 2005. |
Description / Summary |
This paper describes DEPTAL, an open and flexible framework for institutional repositories that reuses open-source technology. DEPTAL is a collection-centric system that manages collections of documents in multiple copies and types. It can also manage users and groups of users, supports authority control (subjects, authors, etc.), and can interoperate with other systems through interfaces such as OAI-PMH, Z39.50, SRU, web services, etc. DEPTAL recognizes descriptive metadata such as UNIMARC and Dublin Core, and organizes the information objects as HTML sites, with descriptions in the METS structural schema, making it very easy to back up and export those objects. For searching, it interoperates with MITRA, a search engine based on LUCENE, which was extended with new features to index not only the full content but also the structured metadata. |
Electronic file
|
borbinha.pdf
(129 Kb)
|