Author Archives: braverock

About braverock

Quantitative developer currently producing models using the Q language and the KDB database from KX Systems.

Performance Comparison with R, Rcpp (C++) and KDB

KDB – Converting a non-keyed table into a keyed table

Turning a non-keyed table into a keyed table

Define a non-keyed table from the Q command prompt as follows:

q) FxPairTable : ([] sym:`symbol$(); date:`date$(); val:`float$())

Next insert illustrative data:

insert [`FxPairTable] (`EURUSD; 2012.02.10; 1.2874)
insert [`FxPairTable] (`EURUSD; 2012.02.10; 1.2874)
insert [`FxPairTable] (`EURUSD; 2012.02.11; 1.2901)

In practice, I prefer to put the data load commands in a Q file to be sourced. This makes for easier development and a more repeatable process.
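As a sketch of that approach, the same definition and inserts could live in a script file. The file name load_fx.q is a hypothetical choice of mine, not something prescribed by KDB:

```q
/ load_fx.q (hypothetical file name): define and populate FxPairTable
FxPairTable:([] sym:`symbol$(); date:`date$(); val:`float$())

/ same three illustrative rows as entered at the prompt
`FxPairTable insert (`EURUSD; 2012.02.10; 1.2874)
`FxPairTable insert (`EURUSD; 2012.02.10; 1.2874)
`FxPairTable insert (`EURUSD; 2012.02.11; 1.2901)
```

The script can then be sourced from a running session with \l load_fx.q, or passed on the command line as q load_fx.q, assuming the file sits in the current working directory.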

We can list the table contents by typing the table name at the command prompt:

q) FxPairTable

My table was listed as follows:

sym date val
------------------------
EURUSD 2012.02.10 1.2874
EURUSD 2012.02.10 1.2874
EURUSD 2012.02.11 1.2901

How do we convert this table into a keyed table?

`sym`date xkey `FxPairTable

Now, again we list the table by typing the table name at the command prompt:

q) FxPairTable

which displays the following table:


sym date | val
-----------------| ------
EURUSD 2012.02.10| 1.2874
EURUSD 2012.02.10| 1.2874
EURUSD 2012.02.11| 1.2901
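Note that the first two rows above share the same key. As far as I am aware, xkey does not check that key values are unique, and a lookup on a duplicated key only reaches the first matching row. Below is a minimal sketch of one way to guard against this, assuming exact duplicate rows can simply be dropped. The names t and kt are illustrative only:

```q
/ illustrative copy of the data, including the duplicated row
t:([] sym:`EURUSD`EURUSD`EURUSD; date:2012.02.10 2012.02.10 2012.02.11; val:1.2874 1.2874 1.2901)

/ drop exact duplicate rows, then key on sym and date
kt:`sym`date xkey distinct t

count kt                   / 2 rows remain
keys kt                    / `sym`date
kt[(`EURUSD;2012.02.11)]   / look up the value column(s) for one key
0!kt                       / remove the key again if needed
```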

First International Workshop on Modeling and Management of Big Data

Call for Papers

First International Workshop on Modeling and Management of Big Data (MoBiD 2013)
http://www.lucentia.es/workshop/mobid13/

In conjunction with the 32nd International Conference on Conceptual Modelling (ER2013)
November 11-13, 2013, Hong Kong
http://www.hkws.org/conference/ER2013/

Special issue in a journal listed in JCR
Best selected papers of MoBiD 2013 will be invited to submit an extended version in a special issue of the Expert Systems journal
================================================================

Introduction
—————-

Due to the enormous and growing amount of data on the Web, there has been increasing interest in incorporating this external and unstructured data, normally referred to as “Big Data”, into traditional applications. As a result, traditional database systems and processing need to evolve to accommodate this new situation. Two main ideas underlie this evolution: this new external and internal data (i) needs to be stored in the cloud and (ii) needs to be offered through a set of services that provide access to it. Following this consideration, there have lately been several proposals (also called the next generation of database systems) based on Hadoop and Hive (frameworks inspired by Google’s MapReduce and Google File System).

Therefore, this new conception of cloud applications incorporating both internal and external Big Data requires new models and methods for the conceptual modelling phase. The objective of MoBiD’13 is thus to be an international forum for exchanging ideas on the latest and best proposals for conceptual modeling in this new data-driven paradigm with Big Data. Papers focusing on the application and use of conceptual modeling approaches (e.g. based on EER, UML and so on) for Big Data, MapReduce, Hadoop and Hive, Big Data analytics, social networking, security and privacy data science, etc. are highly encouraged. The workshop will be a forum for researchers and practitioners interested in the different facets of using conceptual modeling approaches to develop this next generation of Big Data applications.

Target audiences and the scope
———————————————

The scope of the workshop includes but is not limited to:

– Agile modeling
– Advanced applications with MapReduce paradigm
– Application design
– Big Data Analytics
– Business Process Modeling
– Business Intelligence applications modeling
– Conceptual modeling approaches (UML, EER, etc.) for Big Data
– Conceptualization for the data-driven paradigm
– Data-driven businesses
– Enterprise modeling
– Fundamentals of Hadoop: data integrity and file-based data structures
– Hadoop versus MapReduce
– Hive and Hadoop: Architecture and File System
– Hive as a tool to enable easy data extract/transform/load (ETL)
– Hive and Hadoop: examples of applications (Yahoo, Facebook, etc.)
– Information packaging
– Knowledge management for big data
– Metamodeling
– Measurement for social network data
– The need to develop MapReduce applications
– New modeling approaches for Big Data
– Interface design
– Model-driven development methodologies and approaches
– Model transformations
– Provenance modeling
– Process modeling
– Relational Database Management System-RDBMS versus MapReduce
– Requirements modeling for Web-based applications
– Social networking, Security and privacy data science
– Software as a Service (SaaS) modeling solutions
– Use of Hive and Hadoop in social networks
– Visualization of big data
– Analytics for complex data
– Data analytics as a service
– Data mining over the cloud
– Extracting, Transforming and Loading data over the cloud
– Smart Cities

Workshop Chairs
—————

Il-Yeol Song
Drexel University, USA
Email: songiy@drexel.edu

David Gil
Lucentia Research Group
Dept. Computer Science and Technology
University of Alicante, Spain
Email: dgil@dtic.ua.es

Carlos Blanco
Department of Mathematics, Statistics and Computation
University of Cantabria, Spain
Email: carlos.blanco@unican.es

Program Committee
—————————–

Yuan An (Drexel University, Philadelphia, USA)
Marie-Aude Aufaure (Ecole Centrale Paris, France)
Michael Blaha (Yahoo!, Inc.)
Rafael Berlanga Llavori (Universitat Jaume I, Spain)
Gennaro Cordasco (Universita di Salerno, Italy)
Alfredo Cuzzocrea (University of Calabria, Italy)
Gill Dobbie (University of Auckland, New Zealand)
Eduardo Fernandez-Medina Paton (Universidad de Castilla-La Mancha, Spain)
Matteo Golfarelli (University of Bologna, Italy)
Inma Hernandez (University of Sevilla, Spain)
Magnus Johnsson (University of Lund, Sweden)
Nectarios Koziris (National Technical University of Athens, Greece)
Jiexun Li (Drexel University, Philadelphia, USA)
Alexander Loeser (Universitat Berlin, Germany)
Antoni Olivè (Universitat Politecnica de Catalunya, Spain)
Jeffrey Parsons (Memorial University of Newfoundland, Newfoundland and Labrador, Canada)
Oscar Pastor (Universidad Politecnica de Valencia, Spain)
Mario Piattini (Universidad Castilla-La Mancha, Spain)
Nicolas Prat (Ecole Superieure des Sciences Economiques et Commerciales, France)
Sudha Ram (University of Arizona, USA)
Carlos Rivero (University of Sevilla, Spain)
Colette Roland (Universite Paris, Pantheon Sorbonne, France)
Pablo Sanchez (University of Cantabria, Spain)
Keng Siau (Missouri University of Science and Technology, USA)
Alkis Simitsis (Hewlett-Packard Co, Palo Alto, California, USA)
Julia Stoyanovich (University of Pennsylvania)
Alejandro Vaisman (Universidad de la Republica, Uruguay)
Panos Vassiliadis (University of Ioannina, Greece)
Ambrosio Toval (University of Murcia, Spain)
Marta Elena Zorrilla Pantaleon (University of Cantabria, Spain)

Submission Guidelines
——————————–

Formatting instructions
MoBiD 2013 proceedings will be part of the ER2013 Workshop volume published by Springer in the LNCS series. The authors must submit manuscripts using the Springer-Verlag LNCS style for Lecture Notes in Computer Science. See the page
http://www.springer.de/comp/lncs/authors.html
for style files and details. The page limit for workshop papers is 10 pages.

The organizers will oversee a peer-review process for the submitted papers. Manuscripts that are not in the LNCS style or that exceed 10 pages will be rejected automatically without review. Papers must be original and must not be submitted to or accepted for publication in any other workshop, conference, or journal. Submission to MoBiD 2013 is electronic only.

Submission instructions

All workshop abstracts and papers should be uploaded by using the EasyChair system: https://www.easychair.org/conferences/?conf=mobid2013

Important Dates
———————-

Paper submission: May 24, 2013
Paper notification: June 14, 2013
Camera-ready paper submission: July 12, 2013
Author registration: July 21, 2013
Workshop: Nov 11-13, 2013

Tendron Systems Ltd

Creating a table in KDB using Q

Creating and populating a simple keyed table with time series identifiers (sym), dates (date) and values (val)

[sourcecode language="kdb"]

q)TestTable : ([sym:`symbol$(); date:`date$()] val:`float$())

[/sourcecode]

The contents of the table will be displayed simply by typing the table name at the command prompt:

q) TestTable

An empty table will be displayed: just the metadata, i.e. the column names, with no rows shown:

sym date| val
--------| ---
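The same metadata can also be inspected directly. Below is a small sketch using standard q built-ins against the empty TestTable defined above:

```q
meta TestTable    / column names and type characters
keys TestTable    / `sym`date
cols TestTable    / `sym`date`val
count TestTable   / 0 while the table is empty
```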

We can now insert our first value:

q) insert [`TestTable] (`GBPUSD; 2011.02.10; 1.5074)

On typing the table name, the contents are displayed in tabular format:

q) TestTable

You will now see:

sym    date      | val
-----------------| ------
GBPUSD 2011.02.10| 1.5074

You can also run a select query:

q) select * from TestTable

and this generates the same result:

sym    date      | val
-----------------| ------
GBPUSD 2011.02.10| 1.5074

Is it possible to insert several rows at once?

Yes. Note that there is no space in “`USDJPY`ARSRUB”.

q) insert [`TestTable] ([sym: `USDJPY`ARSRUB; date: 2009.02.10 2009.02.11] val: 1.2874 10.252969)

The result of the previous insert statement is:

MySQL Using LOAD DATA INFILE

The LOAD DATA INFILE command is a flexible way to upload bulk volumes of data and is especially useful for loading data files that have been submitted in CSV format.

LOAD DATA LOCAL INFILE 'file.csv' INTO TABLE my_table
FIELDS TERMINATED BY ','
ENCLOSED BY '"'
LINES TERMINATED BY '\n'
(name, address, @var1)
SET dateOfBirth = STR_TO_DATE(@var1, '%d-%b-%y');

In the above code segment I use the SET clause together with a user variable that holds the raw contents of one column of each row. In the column list, I assign the date column of the file to the variable @var1 rather than to a table column. I can then reference @var1 in the SET clause and apply STR_TO_DATE with the format that matches the dates in the data file in question.