Caching

Questions about the WURFL Cloud service.
cycology
Posts: 4
Joined: Fri Mar 23, 2012 9:45 am

Caching

Postby cycology » Fri Mar 23, 2012 9:54 am

Having a look around the code revealed caching appears to be implemented in the (PHP) client - can you describe what configuration is required, and whether it is working out of the box? We use APC, and obviously could also use file, if that's important.

Presumably caching means that for each distinct user agent string, one detection is performed (until the cache is cleaned/expired), right?

kamermans
Posts: 393
Joined: Mon Jun 06, 2011 9:50 am

Re: Caching

Postby kamermans » Fri Mar 23, 2012 10:26 am

Hi cycology,

Caching is enabled by default in all the WURFL Cloud clients via cookies, so each visitor will only result in one call to our cloud service. For the Premium accounts, the PHP client has support for many more methods of caching like APC, Memcache and the Filesystem. The cookie cache is the only supported caching method in the Free, Basic and Standard accounts because it is the only way to correlate website visitors to detections.

Also, cookie caching is faster than any other caching method for return visitors since no call to the cache provider is necessary. In our tests on low traffic sites (<10,000 visits/day), there is little benefit to using a shared cache since most of the user agents from different users are different anyway (even the same devices will have different UAs due to locale differences, firmware revisions, browser version, etc).

Thanks,

Steve Kamerman
ScientiaMobile
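
To illustrate the per-visitor idea described above, here is a minimal sketch of a cookie-backed cache. It is not the WURFL Cloud client's actual implementation: the function name `capabilitiesForVisitor`, the cookie name `device_caps` and the `$detect` callback are all hypothetical. Cookies are passed in as a plain array so the logic can be run outside a web request.

```php
<?php
// Hypothetical sketch of per-visitor cookie caching; not the real client code.

/**
 * Return device capabilities for one visitor, calling $detect only when the
 * visitor's cookies hold no usable cached result.
 *
 * @param array    $cookies Cookies sent by the browser (e.g. $_COOKIE).
 * @param callable $detect  Performs the remote detection and returns an array.
 * @param string   $name    Cookie name (an assumption made for this example).
 * @return array   [capabilities, cookie value to send back or null]
 */
function capabilitiesForVisitor(array $cookies, $detect, $name = 'device_caps')
{
    if (isset($cookies[$name])) {
        $cached = json_decode($cookies[$name], true);
        if (is_array($cached)) {
            // Return visitor: answered from the cookie, no remote call.
            return array($cached, null);
        }
    }
    // First visit (or unreadable cookie): exactly one remote detection.
    $caps = call_user_func($detect);
    return array($caps, json_encode($caps));
}

// Usage: simulate a first visit followed by a return visit.
$calls = 0;
$detect = function () use (&$calls) {
    $calls++;
    return array('is_wireless_device' => true, 'is_tablet' => false);
};

list($caps, $newCookie) = capabilitiesForVisitor(array(), $detect);
// In a real page the value in $newCookie would be sent with setcookie().
list($capsAgain, $unused) = capabilitiesForVisitor(array('device_caps' => $newCookie), $detect);
```

Note that a client which never sends the cookie back would take the detection branch on every request, so this kind of cache only helps for visitors that store cookies.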

Make sure you check out our WURFL Cloud, WURFL InSight and WURFL InFuze products!

Postby cycology » Fri Mar 23, 2012 11:14 am

Thanks Steve, I'll see how we get on with Cookie for now then I guess. Though it was not what I was expecting, I'll bow to your greater knowledge :-)

Postby cycology » Thu Apr 05, 2012 12:17 pm

Hi Steve,

I've just received a message saying we have gone over our free account limit which was pretty surprising given the small (tiny) number of genuine unique users we have each day so far... We're on over 15,000, on pace for 99,310 detections per month, and there is no way we are vaguely close to that many unique users - I wish :-)

The only way I figure so far this could be happening is if page requests from monitoring tools (e.g. Cloudkick, Pingdom), search engine robots and other such things are generating detections, perhaps because you count a cookie as unique and they do not store or provide a cookie (I am guessing that's the case as they would mostly have no need)?

No doubt some of those will have their own sensible user agent, and some others might present a 'fake' user agent that looks like a real browser.

What do you think could be happening - maybe it's not these bots but something else generating these detections? Maybe there's a bug at your end and the stats are wrong? Or is there anything I can do at my end to solve this? Currently the client is called on each and every page via an include using $client->detectDevice();.

To be honest, the most efficient thing I can think of is using my site's built-in caching, as I would normally do for a function called so regularly with the same parameters and return value. To do this, I would need to know how to deliberately pass the user agent string (the return value of each function call with the same user agent passed as a variable would be cached, so 50,000 visits from the same monitoring bot would generate one detection until the cache times out). It seems your PHP client currently collects the user agent from the environment itself after being called, but you say above that you believe this is not efficient versus a cookie. I wonder if that is true in our case though, as something is not right, and most sites these days are hit by all sorts of link checkers, monitoring bots etc - some under the site's control, some from elsewhere that we can do nothing about...

Obviously I can delve into the client code and find out how, or modify things myself, but I thought I'd check with you first, partly as that will save me time, and partly as there may be a better way of finding and fixing this problem...

Thanks,

Andy

Postby kamermans » Thu Apr 05, 2012 1:32 pm

Hi Andy,

Thanks for the response. I can see the traffic coming from your servers quite clearly, but I think you are right that there is something else strange going on here. I will look into it and get back to you, but in the meantime, I will bump up your allowed detections so you are not automatically blocked.

Thanks,

Steve Kamerman
ScientiaMobile

Postby kamermans » Thu Apr 05, 2012 4:16 pm

Hi Andy,

I can confirm that the vast majority of your traffic (>95%) is coming from bots :)

We are in the final stages of rolling out our logic that excludes bots from the detections count. We'll keep your detections limit increased until we resolve the issue.

Meanwhile, we've noticed that your server is not sending the original IP of your visitors, which we also use for bot exclusion. Could you post an example of your PHP code that shows how you are calling the cloud client (without your API Key please)? Specifically, I'd like to see how you're calling detectDevice:

$client->detectDevice();

Thanks,

Steve Kamerman
ScientiaMobile

Postby cycology » Fri Apr 06, 2012 12:39 pm

Thanks Steve - will await your next update...

Interesting re the IP of visitors - I can't imagine what could be causing that... Here's the contents of our wurfl.php which is included in the template for the site (obviously there is potential to add some elimination of monitoring services etc by adding them to the initial if statement, but I imagine this would logically better be handled by you guys so others can benefit?):

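// Eliminates our own API from unnecessary detections if it calls a normal page for some reason.
// strpos() returns 0 (falsy) for a match at the start of the string, so compare against false explicitly.
if (strpos($_SERVER['HTTP_USER_AGENT'], 'Cycology') === false) {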

        // Include the WURFL Cloud Client
        // You'll need to edit this path
        require_once 'REMOVED/libraries/wurfl/Client/Client.php';

        // Create a configuration object
        $config = new WurflCloud_Client_Config();

        // Set your WURFL Cloud API Key
        $config->api_key = 'REMOVED';

        // Create the WURFL Cloud Client
        $client = new WurflCloud_Client_Client($config);

        // Detect your device
        $client->detectDevice();

        // not currently used
        /*$cache = JFactory::getCache('wurfl');
        $cache->setLifeTime( 30 * 86400 );
        $cache->call( array($client, 'detectDevice') );*/

        if ($client->getDeviceCapability('is_wireless_device')) {
                $is_wireless = true;
                if(
                        stristr($_SERVER['HTTP_USER_AGENT'],'iPad') ||
                        stristr($_SERVER['HTTP_USER_AGENT'],'iPhone') ||
                        stristr($_SERVER['HTTP_USER_AGENT'],'iPod')
                ) {
                        $is_ios = true;
                } else {
                        $is_ios = false;
                }
                if ($client->getDeviceCapability('is_tablet')) {
                        $is_tablet = true;
                        $is_phone = false;
                } else {
                        $is_phone = true;
                        $is_tablet = false;
                }
        } else {
                $is_wireless = false;
                $is_phone = false;
                $is_tablet = false;
                $is_ios = false;
        }

} else {
        // Serve the API as we would a mobile phone (might as well benefit from reduced data transfer)
        $is_wireless = true;
        $is_phone = true;
        $is_tablet = false;
        $is_ios = false;
}

Postby kamermans » Fri Apr 06, 2012 1:08 pm

Thanks for the post. We've added some bot rules that remove the vast majority of bots from counting towards detections, although it's not pushed into production quite yet. You are right that it would make things a lot faster if we did this client-side, but the problem is that we don't know what value to return if you do a getDeviceCapability() when there has been no query made to the server yet. In fact, until you make a call to the server, the client doesn't even know which capabilities you have, so there would be no way to even validate the capabilities you're requesting. One option would be to put an isBot() method or something in the client, so you could put that in your if() statement to prevent bot traffic from going to the cloud servers.

Thanks,

Steve Kamerman
ScientiaMobile
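
As a rough illustration of the isBot() idea, the sketch below checks the user agent against a short list of tokens before any call to the cloud service is made. The function name `isLikelyBot` and the token list are hypothetical and not part of the WURFL Cloud client. A real list would need to be longer and maintained, and bots that present a browser-like user agent will not be caught.

```php
<?php
// Hypothetical helper; the WURFL Cloud client does not ship this function.

/**
 * Guess whether a user agent string belongs to a self-identifying bot.
 *
 * @param string|null $userAgent Raw User-Agent header value.
 * @return bool
 */
function isLikelyBot($userAgent)
{
    if ($userAgent === null || $userAgent === '') {
        // Treating a missing user agent as a bot is an assumption of this sketch.
        return true;
    }
    // Example tokens only; Pingdom and Cloudkick are the tools named in this thread.
    $tokens = array('bot', 'crawler', 'spider', 'pingdom', 'cloudkick');
    foreach ($tokens as $token) {
        // stripos() returns false when there is no match, so compare explicitly.
        if (stripos($userAgent, $token) !== false) {
            return true;
        }
    }
    return false;
}

// Usage: skip detection for bots.
$ua = isset($_SERVER['HTTP_USER_AGENT']) ? $_SERVER['HTTP_USER_AGENT'] : '';
if (!isLikelyBot($ua)) {
    // $client->detectDevice(); // call the cloud client here
}
```

When the check skips detection, the page still needs default values for whatever flags it would otherwise read from the client.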

Juan.Nin
Posts: 1
Joined: Fri May 11, 2012 11:46 am

Postby Juan.Nin » Fri May 11, 2012 11:47 am

Hi!

I'm about to start testing the cloud service, and first took a look at the forum and found this thread.
Wanted to know then if the bot detection has already been added to production.

Thanks in advance.

Postby kamermans » Fri May 11, 2012 12:58 pm

Indeed, traffic from bots that identify themselves is not counted.

Thanks,

Steve Kamerman
ScientiaMobile

venkat_9099
Posts: 1
Joined: Mon Jul 23, 2012 12:15 am

Postby venkat_9099 » Mon Jul 23, 2012 12:19 am

kamermans wrote: You are right that it would make things a lot faster if we did this client-side, but the problem is that we don't know what value to return if you do a getDeviceCapability() when there has been no query made to the server yet. In fact, until you make a call to the server, the client doesn't even know which capabilities you have, so there would be no way to even validate the capabilities you're requesting.

Hi Steve,

If one does getDeviceCapability() and no values are present in the cache, make a call to the server, fill up the cache and return the values. I am working on this caching logic as I cannot afford to make a call to your servers for each and every request. I hope some kind of client-side caching like this is implemented in the next releases.

Postby kamermans » Mon Jul 23, 2012 7:55 am

I think that you have misunderstood my quote, which is saying that if we were able to programmatically stop robot traffic from going to the cloud service, this would improve performance. With regard to caching in general, all the cloud clients come with a cookie or file-based caching system that prevents duplicate requests from the same user from going to our cloud service. The principle behind this concept is that we are charging for our plans based on the traffic of our customers. Unfortunately, a shared cache would mean we are charging on the diversity of your users. For this reason, using a custom caching system with the WURFL Cloud Service's Free, Basic and Standard plans is not permitted. If you would like to use a higher-performance or custom caching system, you will need to upgrade to a Premium account, which includes a Cloud Client with a shared-memory cache provider.

Thanks,

Steve Kamerman
ScientiaMobile
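
For accounts where a shared cache is permitted, the general shape of such a cache is a read-through lookup keyed by user agent. The sketch below is a generic illustration using an in-memory array, not the Premium client's cache provider. The class name `UserAgentCache` and the `$detect` callback are hypothetical.

```php
<?php
// Generic read-through cache sketch; all names here are hypothetical.

class UserAgentCache
{
    private $store = array(); // stands in for APC, Memcache or the filesystem
    private $detect;          // callable: user agent string -> capabilities array
    private $ttl;             // lifetime of an entry in seconds

    public function __construct($detect, $ttl)
    {
        $this->detect = $detect;
        $this->ttl = $ttl;
    }

    /**
     * Return capabilities for a user agent, detecting only on a miss.
     * $now can be injected so that expiry is easy to exercise.
     */
    public function get($userAgent, $now = null)
    {
        $now = ($now === null) ? time() : $now;
        $key = md5($userAgent);
        if (isset($this->store[$key]) && $this->store[$key]['expires'] > $now) {
            return $this->store[$key]['caps']; // cache hit
        }
        // Miss or expired entry: detect once, then remember the result.
        $caps = call_user_func($this->detect, $userAgent);
        $this->store[$key] = array('caps' => $caps, 'expires' => $now + $this->ttl);
        return $caps;
    }
}

// Usage: the callback stands in for a remote detection.
$calls = 0;
$cache = new UserAgentCache(function ($ua) use (&$calls) {
    $calls++;
    return array('is_wireless_device' => stripos($ua, 'iPhone') !== false);
}, 3600);

$first  = $cache->get('Mozilla/5.0 (iPhone)', 1000);
$second = $cache->get('Mozilla/5.0 (iPhone)', 2000); // within the TTL: no new detection
$third  = $cache->get('Mozilla/5.0 (iPhone)', 5000); // entry expired: detected again
```

With this shape, repeated requests carrying the same user agent cost one detection per cache lifetime, which is exactly why it changes what a traffic-based plan is counting.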
