Gautam's Blog

The technical blog of Gautam!


Daniel Shiffman wrote some code for Bayesian filtering, a technique used to classify messages as spam or ham based on their content. Twitter has been getting a lot of spam recently, and I was wondering how well Bayesian filtering would work on Twitter spam. Some Twitter data is available at http://www.public.asu.edu/~mdechoud/datasets.html under a Creative Commons license.

I modified the code to run on Hadoop and let the system run. The results were not very encouraging, possibly because the datasets for email spam and Twitter spam differ. Hopefully more spam and ham words will become available for classifying Twitter spam. Until then, this idea has to wait.
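For readers unfamiliar with the technique, here is a minimal sketch of a Bayesian (naive Bayes) spam filter in Python. It is not Daniel Shiffman's code and not the Hadoop version described above; the class name, the whitespace tokenizer, the add-one smoothing, and the tiny training set are all assumptions made for illustration.

```python
import math
from collections import Counter


def tokenize(text):
    # Assumption: naive lowercase whitespace tokenization.
    return text.lower().split()


class NaiveBayesFilter:
    """Illustrative two-class (spam/ham) naive Bayes filter."""

    def __init__(self):
        self.word_counts = {"spam": Counter(), "ham": Counter()}
        self.doc_counts = {"spam": 0, "ham": 0}

    def train(self, text, label):
        # Record one labeled message and its word occurrences.
        self.doc_counts[label] += 1
        self.word_counts[label].update(tokenize(text))

    def spam_probability(self, text):
        vocab = set(self.word_counts["spam"]) | set(self.word_counts["ham"])
        total_docs = sum(self.doc_counts.values())
        log_scores = {}
        for label in ("spam", "ham"):
            total_words = sum(self.word_counts[label].values())
            # Smoothed class prior, so an untrained class never gives log(0).
            score = math.log((self.doc_counts[label] + 1) / (total_docs + 2))
            for word in tokenize(text):
                count = self.word_counts[label][word]
                # Add-one (Laplace) smoothing for unseen words.
                score += math.log((count + 1) / (total_words + len(vocab) + 1))
            log_scores[label] = score
        # Convert the two log scores into a normalized probability of spam.
        top = max(log_scores.values())
        weights = {k: math.exp(v - top) for k, v in log_scores.items()}
        return weights["spam"] / (weights["spam"] + weights["ham"])
```

Trained on email-style text, a filter like this only knows email vocabulary, which is one plausible reason it would transfer poorly to tweets.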

Second level TLD


Effective TLD: http://mxr.mozilla.org/mozilla-central/source/netwerk/dns/src/effective_tld_names.dat?raw=1

Mozilla TLD: https://wiki.mozilla.org/TLD_List
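A rough sketch of how an effective-TLD list like the one above can be used to find the registrable domain of a hostname. The function name and the small `SAMPLE_RULES` set are hypothetical; a real program would load the rules from the full list, and this simplified matcher only handles plain rules, `*.` wildcard rules, and `!` exception rules.

```python
# Hypothetical sample of rules in the effective TLD list's format.
SAMPLE_RULES = {"com", "uk", "co.uk", "*.ck", "!www.ck"}


def registrable_domain(hostname, rules):
    """Return the effective TLD plus one label, or None if the
    hostname is itself an effective TLD."""
    labels = hostname.lower().rstrip(".").split(".")
    suffix_len = 1  # default: treat the last label as the suffix
    for i in range(len(labels)):
        candidate = labels[i:]
        exact = ".".join(candidate)
        wildcard = ".".join(["*"] + candidate[1:])
        if "!" + exact in rules:
            # Exception rule wins: the suffix is the rule minus its first label.
            suffix_len = len(candidate) - 1
            break
        if exact in rules or wildcard in rules:
            # Otherwise keep the longest matching rule.
            suffix_len = max(suffix_len, len(candidate))
    if len(labels) <= suffix_len:
        return None
    return ".".join(labels[-(suffix_len + 1):])
```

With these sample rules, `registrable_domain("www.example.co.uk", SAMPLE_RULES)` yields `example.co.uk`, while `co.uk` on its own yields `None`.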

INTERNET FREEDOM LAW WILL KEEP INTERNET OPEN FOR FUTURE INNOVATORS

Internet Malicious Activity Maps

http://bonsai.ims.u-tokyo.ac.jp/~mdehoon/software/cluster/software.htm


SANS: Report 1; Report 2

Our paper on Email reputation: tracking email reputation for authenticated sender identities has been accepted at the Conference on Email and Anti-Spam 2008. The paper discusses our results from a single organization, covering October 2007 to March 2008.

http://forums.mysql.com/read.php?33,41056,61281#msg-61281

If by “analytic” you mean OLAP, data mining etc: there are a number of options.

OLAP servers:

* Mondrian: Java OLAP engine that uses most DBMSs, including MySQL
* Palo: spreadsheet server. Custom data source, Excel and PHP integrations
* Lemur: C++ OLAP server – academic

OLAP UIs

* JPivot: Web OLAP UI integrating with Mondrian and XML/A (Microsoft Analysis Services, Hyperion). JSP tags + portlets
* JRubik: Java thick client for Mondrian – based on JPivot
* pocOLAP: crosstab Web UI

Data mining, statistics

* Weka
* Project R

Reporting servers

* BIRT
* JasperReports Portal: JasperReports in JBoss Portal
* OpenReports: JasperReports based web application
* marvelIT: OpenReports in a portal

BI servers: Reporting and Analytics

* Pentaho
* SpagoBI
* OpenI
* Bizgres

ETL – to build data warehouses and operational data stores

* Octopus
* KETL – part of Bizgres
* Kettle
* CloverETL
* dbmt

LISA 07



Our paper, “RepuScore: Collaborative Reputation Management Framework for Email Infrastructure”, has been accepted at LISA '07, to be held in Dallas, TX.
