Top 5 things I learned at Big Data LDN

Harris Asad 28-Sep-2022 13:07:33

I attended the Big Data LDN event as part of the development team to learn more about data science, consult with experts, and view the latest product launches and demos. As someone attending the event for the first time, I was blown away by the sheer amount of innovation and change happening in the big data world. Here are the top five things I learned after attending different sessions and chatting with some industry professionals.


1. The Data Science Process is Getting Democratised

What stood out during the event was the sheer number of companies trying to democratise the data science process. Whether it's fetching the data, normalising it, modelling it or engineering features out of it, much of the process is being made accessible through simple drag-and-drop GUI tools. Although the veracity and precision of these tools still need to be tested in the industry, it was a great showcase of how smaller teams can use off-the-shelf tools to make sense of their data and quickly provide value to their internal and external stakeholders.

The one company that stood out to me in this domain was Dataiku, which sits mostly at the downstream end of the data science pipeline. What was impressive about this software was how easily you can access, explore, orchestrate, visualise, wrangle and feature engineer your data and then apply predictive machine learning techniques with its AutoML feature. The whole process can be done without writing any code, and the visual interface of the data science pipeline can make data teams more efficient.

2. More Specialisation is Entering the Data World

Apart from the three major public cloud providers, which offer end-to-end services that help data teams scale and reach production quickly, there was a plethora of newer companies that each specialised in a single aspect and claimed to handle it much better than any service offered by the big three. There were companies that, for example, specialised in making your data compliant with the General Data Protection Regulation (GDPR), and others that specialised in a dedicated ETL/ELT tool, like the one Fivetran offers.


Companies like dbt (data build tool) specialised in transforming data in the warehouse (the T in ELT), and others like Snowflake specialised in acting as a much more efficient data warehouse. Overall, the theme was familiar: companies were overwhelmingly specialising in one domain of the data engineering/data science pipeline, and they marketed themselves as a better and cheaper alternative to what the major cloud providers were offering.


3. Data Privacy is a Much Bigger Concern for Stakeholders

Another important aspect of the event was how many companies were catering specifically to the banking and finance industries by assuring complete data privacy for their customers. Amongst C-level executives from the finance world, protecting the privacy and identity of their clients was one of the biggest challenges they faced as fintech products became more accessible to the public.


This aspect was so important this year that, apart from the traditional layers of big data technologies, the event had a dedicated theatre called 'Customer data and Privacy Theatre' to showcase what was on the market to address the problem of sensitive data leaks. Industry leaders shared various strategies, best practices, and new software products to protect customer data and ensure maximum trust.

4. Querying/Working with Big Data is Getting Faster

Although tools like Spark have long existed for large-scale data processing, there are now more tools to store, query and process different types of data very quickly. For example, time-series data can now be queried at blazingly fast speeds on the platform provided by a company called QuestDB. Their platform boasts the ability to query billions of records in a time-series dataset in milliseconds. It was of particular interest to trading companies, which analyse billions of data points from the stock and crypto markets because they have access to very granular data.


Another big data processing platform for machine learning and analytics featured extensively during the conference was Databricks, a unified web-based platform for working with Spark. The Databricks team at the event presented fascinating use cases and industry problems that had been solved through their platform. As a vendor for Spark, their platform is extremely popular for working with big data technologies in an IPython-style notebook fashion.


5. Innovation in the Big Data World is on the Rise

Microsoft, the innovation partner for the event, was front and centre with its HoloLens 2, an augmented reality headset marketed to help businesses from different sectors improve efficiency, onboarding, safety and productivity. Visitors to their stall could demo the product if they had the patience to stand in the crowded queue.


Other innovative products featured during the event included specialised analytics platforms for the automobile, healthcare and legal industries. Seeing such inventive and revolutionary products on display made me excited about the transformative phase that the big data world is going through. The sheer number of start-ups and new technologies featured during the event was mind-boggling.

The Big Data LDN event will take place again in autumn next year. If so many things can change within the industry in one year, I can't wait to see what happens in the years to come. 

For more information, get in touch with our Development Team to find out more about our Data Science Services or book a meeting...
