Giving Comment Spammers Less Incentive to Spam You

December 8, 2009 at 10:01 PM by Ben

The latest check-in of BE.NET (1.5.1.36) has a small but important change.  The three themes included with BE.NET now add rel="nofollow" to the links to commenters’ websites.

This is a theme-specific change.  If you’re using a custom theme, there’s a good chance you won’t have the NOFOLLOW instruction on these links, even after you upgrade to the latest build of BE.NET.  It can simply be added in the CommentView.ascx file in your theme’s blog folder.

As I’m using a custom theme myself, I just added NOFOLLOW to this blog too.  Wikipedia has a good writeup on NOFOLLOW, in case you aren’t familiar with its purpose.  I’m a little surprised it’s taken this long to get NOFOLLOW into the themes that are included with BE.NET.  Better late than never!

I get a lot of comment spam on this blog.  Because I moderate comments and don’t approve any of it, none of it ever shows up (TIP to spammers: stop wasting your time!).

Comment spammers are a lot more likely to leave comments on blogs that do not include NOFOLLOW.  Yes, I’m sure a lot of the spammers actually look at these types of details when scoping out blogs to attack.

Incidentally, the ResolveLinks extension that comes with BE.NET already includes the NOFOLLOW instructions.  This is the extension that will convert URLs in comments into hyperlinks.  If the extension finds a URL like www.google.com in the comment content, it will convert that into:

<a href="http://www.google.com" rel="nofollow">www.google.com</a>

This conversion is done as the comment is being served.
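As a rough illustration of that kind of conversion, here is a minimal C# sketch.  It is not the actual ResolveLinks code: the NoFollowLinker class, its Linkify method and the regular expression are made up for this example, and it assumes the comment text has already been HTML-encoded before it gets here.

```csharp
using System;
using System.Text.RegularExpressions;

// Hypothetical helper, NOT the real ResolveLinks extension.
public static class NoFollowLinker
{
    // Matches bare URLs starting with http://, https:// or www.
    // The final character class keeps trailing punctuation (such as the
    // period that ends a sentence) out of the link.
    private static readonly Regex UrlPattern = new Regex(
        @"\b(?:https?://|www\.)[^\s<>""]*[^\s<>"".,;:!?)]",
        RegexOptions.IgnoreCase | RegexOptions.Compiled);

    // Assumes commentText is already HTML-encoded, so the matched URL
    // can be placed into the markup as-is.
    public static string Linkify(string commentText)
    {
        if (string.IsNullOrEmpty(commentText))
            return commentText;

        return UrlPattern.Replace(commentText, m =>
        {
            string display = m.Value;

            // A URL typed as "www.example.com" needs a scheme to be a valid href.
            string href = display.StartsWith("www.", StringComparison.OrdinalIgnoreCase)
                ? "http://" + display
                : display;

            return "<a href=\"" + href + "\" rel=\"nofollow\">" + display + "</a>";
        });
    }
}
```

Calling NoFollowLinker.Linkify("see www.google.com for more") wraps only the URL in a link and leaves the surrounding text untouched.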

I’m anxious to see what type of impact adding NOFOLLOW will have on my level of comment spam.
Fingers crossed ...

Logging & Improved Error Reporting in BlogEngine

June 13, 2009 at 6:26 PM by Ben

There are two new features in the latest build of BE, 1.5.1.11.  These features, Logging and Improved Error Reporting, are separate but related.  I think both will turn out to be very helpful, especially when trying to diagnose a problem.  I’ll explain both features and how they can be used independently of each other, as well as with each other.

Logging

There’s a new event in BE that any extension or other component can subscribe to: Utils.OnLog.  It can be subscribed to in an extension, like:

Utils.OnLog += new EventHandler<EventArgs>(OnLog);

In this case, there’s an OnLog event handler that will fire every time any piece of code in BE.NET logs a message.  I created a simple Logger extension, now included in BE 1.5.1.11, that subscribes to log notifications and writes each log message to a logger.txt file in the App_Data folder.  Anyone can write a similar extension that saves log messages to a database instead.  The code for this new Logger extension can be viewed below.

#region using

using System;
using BlogEngine.Core;
using BlogEngine.Core.Web.Controls;
using System.IO;
using System.Text;

#endregion

/// <summary>
/// Subscribes to Log events and records the events in a file.
/// </summary>
[Extension("Subscribes to Log events and records the events in a file.", "1.0", "BlogEngine.NET")]
public class Logger
{
    static Logger()
    {
        Utils.OnLog += new EventHandler<EventArgs>(OnLog);
    }

    /// <summary>
    /// The event handler that is triggered every time there is a log notification.
    /// </summary>
    private static void OnLog(object sender, EventArgs e)
    {
        if (sender == null || !(sender is string))
            return;

        string logMsg = (string)sender;

        if (string.IsNullOrEmpty(logMsg))
            return;

        string file = GetFileName();

        lock (_SyncRoot)
        {
            try
            {
                using (FileStream fs = new FileStream(file, FileMode.Append))
                {
                    using (StreamWriter sw = new StreamWriter(fs))
                    {
                        sw.WriteLine(@"*-*-*-*-*-*-*-*-*-*-*-*-*-*-*-*-*-*-*-*-*-*-*-*-*-*-*-*-*-*-*-*-*-*-*-*-*-*-*-*");
                        sw.WriteLine("Date: " + DateTime.Now.ToString());
                        sw.WriteLine("Contents Below");
                        sw.WriteLine(logMsg);
                        // The using blocks flush and close the writer and the stream.
                    }
                }
            }
            catch
            {
                // Absorb the error.
            }
        }
    }

    private static string _FileName;
    private static object _SyncRoot = new object();

    private static string GetFileName()
    { 
        if (_FileName != null)
            return _FileName;

        _FileName = System.Web.Hosting.HostingEnvironment.MapPath(Path.Combine(BlogSettings.Instance.StorageLocation, "logger.txt"));
        return _FileName;
    }
}


Any code within BE.NET, a widget, extension, etc. can now log any message like:

Utils.Log("some message to log");


If more than one event handler is subscribed to the OnLog notifications, each handler will fire.  It’s worth noting that Utils.Log() accepts a parameter of type object, while the Logger extension is designed to receive simple string-based messages (it casts the object parameter to a string).  If Logger receives a non-string message, it doesn’t log it.  If an extension or other piece of code wants to pass a non-string message to a logger, a different extension could be created that is equipped to handle log messages of other types.
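To make the multiple-subscriber behavior concrete, here is a small self-contained sketch.  UtilsStandIn is a stand-in I made up so the example compiles without BlogEngine.Core; it only mimics the pattern the Logger code relies on, where the logged object arrives as the handler’s sender argument.  The LogHandlers class and the exception handler are hypothetical as well.

```csharp
using System;
using System.Collections.Generic;

// Stand-in for BlogEngine.Core.Utils, reduced to the logging event.
// Assumption: the logged object is passed to handlers as "sender".
public static class UtilsStandIn
{
    public static event EventHandler<EventArgs> OnLog;

    public static void Log(object message)
    {
        EventHandler<EventArgs> handler = OnLog;
        if (handler != null)
            handler(message, EventArgs.Empty);
    }
}

// Two hypothetical subscribers that collect messages in memory.
public static class LogHandlers
{
    public static readonly List<string> StringMessages = new List<string>();
    public static readonly List<string> ExceptionMessages = new List<string>();

    // Behaves like the built-in Logger: anything that is not a
    // non-empty string is ignored.
    public static void OnStringLog(object sender, EventArgs e)
    {
        string message = sender as string;
        if (!string.IsNullOrEmpty(message))
            StringMessages.Add(message);
    }

    // A second handler that only understands exceptions.
    public static void OnExceptionLog(object sender, EventArgs e)
    {
        Exception ex = sender as Exception;
        if (ex != null)
            ExceptionMessages.Add(ex.GetType().Name + ": " + ex.Message);
    }
}
```

After subscribing both handlers with UtilsStandIn.OnLog += LogHandlers.OnStringLog and UtilsStandIn.OnLog += LogHandlers.OnExceptionLog, every call to UtilsStandIn.Log() reaches both of them, and each handler keeps only the message types it understands.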

If you don’t want logging to take place, either because you don’t want to worry about a log file that keeps growing, or because you prefer not to store data in the App_Data folder (as the Logger extension does), you can simply disable the Logger extension on the Extensions tab in the control panel.  As of right now, even if you leave the Logger extension enabled, there are going to be virtually no messages logged, as there isn’t any code that currently calls Utils.Log().

This new logging feature is going to help keep track of events going on.  Although logging can be used for many purposes, one of the reasons I wanted to have this in BE.NET was to be able to record unhandled errors.

Error Handling & Reporting

There’s now an event handler in the Global.asax file that catches all unhandled exceptions.  Up till now, if a 500 server error occurred, you would be redirected to error404.aspx, as defined by the <customErrors> tag in the web.config file.  While this is a nice catch-all method of handling errors, people just getting started with BE.NET are often confused about why they see a “Page cannot be found” message when they try to do something like save changes.  An unhandled error has occurred, but they can’t tell that from a “Page cannot be found” message.  To be fair, most people just getting started with BE.NET are not getting errors.  But some do, and adjusting folder permissions or other settings fixes the errors they are seeing.  To fix a problem, you first need to know that an error is occurring, and what the actual error is!

The new Application_Error event handler in Global.asax does up to three things for non-404 errors:

  1. It generates a summary, including details, of the unhandled error that just occurred.
  2. If error logging is turned on (a new option, explained below), it makes a call to Utils.Log(), passing the error summary to any event handlers registered to receive log notifications.
  3. It does a Server.Transfer() to a new error.aspx page in the root of the BE.NET web folder.

For #1, the summary generated includes a stack trace, inner exceptions, and the URL the error occurred on (both Request.Url and Request.RawUrl).
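A summary along those lines could be assembled roughly as follows.  This is only a sketch of the idea, not the code that ships in Global.asax: the ErrorSummary class, its Build method and the exact layout of the text are my own invention for illustration.

```csharp
using System;
using System.Text;

// Hypothetical helper, not the actual Global.asax code.
public static class ErrorSummary
{
    // Builds a plain-text summary of an unhandled error: both URLs,
    // then the exception and each inner exception in turn.
    public static string Build(Exception error, string url, string rawUrl)
    {
        StringBuilder sb = new StringBuilder();
        sb.AppendLine("Url: " + url);
        sb.AppendLine("Raw Url: " + rawUrl);

        int depth = 0;
        for (Exception ex = error; ex != null; ex = ex.InnerException)
        {
            sb.AppendLine(depth == 0 ? "Exception:" : "Inner exception " + depth + ":");
            sb.AppendLine("  Type: " + ex.GetType().FullName);
            sb.AppendLine("  Message: " + ex.Message);
            // StackTrace is null for an exception that was never thrown.
            sb.AppendLine("  Stack trace: " + (ex.StackTrace ?? "(none)"));
            depth++;
        }

        return sb.ToString();
    }
}
```

Inside an Application_Error handler, the arguments would presumably come from Server.GetLastError(), Request.Url and Request.RawUrl.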

The summary that is generated is stored in the Items collection of HttpContext.Current.  Why?  When Application_Error in Global.asax does a Server.Transfer to the new error.aspx page, the error.aspx page checks to see if the person is logged in (i.e. if they’re authenticated).  If they are logged in, error.aspx will display the error summary generated within Global.asax.  The error summary is retrieved out of the HttpContext.Current.Items collection.  If the person isn’t logged in, they just see a message similar to error404.aspx, indicating an “unexpected error has occurred”, and the developer will be tortured, blah blah. :-)

Displaying the error details directly on error.aspx for logged in users is helpful for two reasons.  (a) Immediate knowledge of the error, and (b) even if the new error logging option is turned off, you still can see the error message in your browser when you’re transferred to error.aspx.

I’ve mentioned this new error logging option twice now.  On the Settings tab in the control panel, in the Advanced Settings section, there is a new checkbox labeled “Enable error logging”.  By default, it’s turned off.  While this is turned off, the only notifications the Logger extension will receive (if you leave the Logger extension enabled) will be messages coming from some piece of code that explicitly makes a call to Utils.Log().  If you turn this new Enable Error Logging feature on, then when an unhandled exception occurs, Global.asax will pass the error details to Utils.Log() for any registered event handlers to deal with.

Result & Extensibility Possibilities

From version 1.5.1.11, any extension, widget or even BE.NET itself can now simply make a call to Utils.Log() to have any message logged.  Logging can be done with the built-in Logger extension, or messages can be logged to a database, sent via email, etc. by any other extension someone creates.  I’m also very excited about the new error handling mechanism which will give administrators a lot more information about errors that may be occurring in their blog.

Please download and test out the latest build.  If any problems show up or you have any ideas for improvements, you can leave a comment here, or post a message on the CodePlex discussion boards.

Web Farm Extension 1.0

May 10, 2009 at 12:13 AM by Ben

The caching of data in BlogEngine.NET becomes a problem when BE.NET is installed in a web farm.  When you add, edit or delete a post, that change is occurring on one machine within the farm as well as within the data store (App_Data or DB).  But the other servers within the farm are unaware there’s been a change in data.  Until the data loaded in memory on these other servers clears out, which can take anywhere from minutes to hours to days, the old set of data will continue to be shown to visitors hitting one of these other servers.

I created a WebFarm extension which may help some people in this situation.  I haven’t worked much with web farms, so I’m not sure how well this extension will work.  Any feedback is appreciated.  I was able to test this extension on Vista/IIS7 where I had two separate web applications pointing to the same physical BE.NET location on my machine.  Even in this situation, creating a new post within one application would normally result in the post NOT showing up in the other application.  This new Web Farm extension did solve the caching problem for this scenario.  I’m hoping this success will carry over to a Web Farm scenario.

There are two files in the ZIP file download.  WebFarm.cs should go in the App_Code\Extensions folder.  The other file, webfarm_data_update_listener.ashx, should go in the root of your blog.

Once those files are in their correct locations, if you go to the Extensions tab in the control panel, you’ll want to click ‘Edit’ for this new WebFarm extension.

[Image: WebFarm Extension]

I wanted the help box on the right side to include as much information as possible.  But as you can see, the Extensions page doesn’t currently handle long description boxes very well :)

The idea behind this extension is that if each server within the farm has a unique internal IP address, this extension can notify each server that a change in data has occurred.  Not knowing a lot about web farms, this is the part I’m unsure is possible.  But it does seem reasonable that each server would have its own unique IP address.  If you’re using host headers, this extension may not work for you, unless you have a unique URL to each server within the farm.

The extension currently notifies the servers in the web farm when a new post or page has been created, updated or deleted.  Other data such as Settings, Profiles, Categories, Comments, etc. is not handled by this extension.  Or at least not in this version.

The webfarm_data_update_listener.ashx file you placed in the root of the blog is the handler that receives notifications when a change in Posts or Pages has occurred.  The data passed to the handler includes the type of data changed (Post, Page), the type of change (Insert, Update, Deletion) and the ID of the Post or Page that has been inserted/updated/deleted.  Rather than re-loading all the Post/Page data, the handler will insert, update or delete just the one piece of data that changed.  This is more efficient than re-loading all the data, which could be taxing for those with a lot of blog data.

As described in the help area when adding the server IPs/URLs in the WebFarm extension, make sure the “Shared Key” you enter matches the key in the webfarm_data_update_listener.ashx handler file.  The default key in the handler is “blogengine”.  For security purposes, you may change the key in the ASHX handler.  But be sure the ASHX key matches the key you enter for each IP/URL.

Also, because of this same data caching issue in web farms, after you enter all the web farm server IP addresses into the WebFarm extension, you’ll want to re-start BE.NET so the extension data you just entered is detected by all the servers in your farm.  Updating the web.config file with any meaningless change will accomplish re-starting the blog application.  This is just a one-time requirement so all servers have the list of servers they need to notify when a post or page change has occurred.
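To give a feel for the notification and the shared key check, here is a simplified sketch.  It is not the code from the download: the query string parameter names (key, type, action, id), the FarmNotifier class and the use of a Guid for the ID are assumptions I’m making for illustration only.

```csharp
using System;
using System.Collections.Generic;

public enum DataKind { Post, Page }
public enum ChangeKind { Insert, Update, Delete }

public sealed class ChangeNotice
{
    public DataKind Kind;
    public ChangeKind Change;
    public Guid Id;
}

// Hypothetical sketch; parameter names are assumptions, not the real protocol.
public static class FarmNotifier
{
    // Sender side: builds the URL one server would request on another.
    public static string BuildUrl(string serverBaseUrl, string sharedKey, ChangeNotice notice)
    {
        return serverBaseUrl.TrimEnd('/') + "/webfarm_data_update_listener.ashx"
            + "?key=" + Uri.EscapeDataString(sharedKey)
            + "&type=" + notice.Kind
            + "&action=" + notice.Change
            + "&id=" + notice.Id.ToString("D");
    }

    // Listener side: returns null when the key does not match or a field
    // is missing or malformed, so the caller can ignore the request.
    public static ChangeNotice TryRead(IDictionary<string, string> query, string expectedKey)
    {
        string key, type, action, id;
        if (!query.TryGetValue("key", out key) || key != expectedKey)
            return null;
        if (!query.TryGetValue("type", out type) ||
            !query.TryGetValue("action", out action) ||
            !query.TryGetValue("id", out id))
            return null;

        DataKind kind;
        ChangeKind change;
        Guid guid;
        if (!Enum.TryParse(type, true, out kind) || !Enum.IsDefined(typeof(DataKind), kind))
            return null;
        if (!Enum.TryParse(action, true, out change) || !Enum.IsDefined(typeof(ChangeKind), change))
            return null;
        if (!Guid.TryParse(id, out guid))
            return null;

        return new ChangeNotice { Kind = kind, Change = change, Id = guid };
    }
}
```

With a notice in hand, the listener only has to insert, update or delete the single post or page identified by Id, which is what keeps this cheaper than re-loading everything.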

I realize this extension has its limitations and isn’t a comprehensive solution to undesired caching in a web farm scenario.  But it does handle propagating changes in posts across the servers in the farm – one of the more important areas.  This extension is also easy to get started with in contrast to making various changes within BE.NET itself.  Again, any feedback on this extension is appreciated.

Download: WebFarmExtension_1.0.zip (3.26 kb)